

# Europäisches Patentamt **European Patent Office**

Office européen des brevets



EP 0 506 111 B1 (11)

(12)

# **EUROPEAN PATENT SPECIFICATION**

(45) Date of publication and mention of the grant of the patent: 12.04.2000 Bulletin 2000/15

(51) Int. Cl.7: G06F 17/16

(21) Application number: 92105359.1

(22) Date of filing: 27.03.1992

# (54) DCT/IDCT processor and data processing method

Diskreter/invers-diskreter Cosinus-Transformationsprozessor und Datenverarbeitungsverfahren Processeur de calcul d'une transformée discrète/inverse-discrète du cosinus, et procédé de traitement de données

(84) Designated Contracting States: **DE FR NL** 

(30) Priority: 27.03.1991 JP 6325991

(43) Date of publication of application: 30.09.1992 Bulletin 1992/40

(73) Proprietor: MITSUBISHI DENKI KABUSHIKI KAISHA Tokyo (JP)

(72) Inventors:

 Uramoto, Shinichi, c/o Mitsubishi Denki K.K. LSI Itami-shi, Hyogo-ken (JP)

· Inoue, Yoshitsugu, c/o Mitsubishi Denki K.K. LSI Itami-shi, Hyogo-ken (JP)

(74) Representative:

Prüfer, Lutz H., Dipl.-Phys. Patentanwalt, Dipl.-Physiker Lutz H. Prüfer, Dr. Habil. Jürgen Materne, Harthauser Strasse 25d 81545 München (DE)

(56) References cited: EP-A- 0 275 979

> TENCON '89, 22 - 24 NOVEMBER 1989, BOMBAY INDIA pages 74 - 77 S.N. MERCHANT ET AL. 'Distributed arithmetic architecture for image coding'

 1989 IEEE SYMPOSIUM ON CIRCUITS AND SYSTEMS, 8 - 11 MAY 1989, PORTLAND US, vol.1 V. RAMPA ET AL. 'Computer-aided synthesis of a bi-dimensional discrete cosine transform chip'

 INTERNATIONAL JOURNAL OF ELECTRONICS, vol.69, no.2, August 1990, LONDON GB pages 233 - 246 K.W. CURRENT ET AL. 'Unified forward and inverse discrete cosine transform architecture and proposed VLSI implementation'

Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

### Description


[0001] The present invention relates generally to data processors and data processing methods and, more particularly, to an apparatus and method for carrying out discrete cosine transform or inverse discrete cosine transform of data.

[0002] In order to process video data at a high speed, high-efficiency coding is carried out. In high-efficiency coding, the data amount of a digital video signal is compressed while picture quality is maintained as high as possible. A redundant component of the signal is first removed for efficient coding. For this purpose, orthogonal transform techniques are often employed, one of which is the discrete cosine transform (DCT). The DCT is implemented by a simple product sum operation using a cosine function as a coefficient. The DCT is defined by the following expression (1):

$$Y = AX \tag{1}$$

where X is an N-term column vector indicating input data, Y is an N-term column vector indicating output data, and A is an N-by-N coefficient matrix whose elements are given by the following expression:

$$A(i, j) = \sqrt{\frac{2}{N}} \cdot C(j) \cdot \cos \frac{(2i + 1) j}{2N} \pi$$

$$C(j) = \begin{cases} \sqrt{1/2} & (j = 0) \\ 1 & (j \neq 0) \end{cases} \qquad i, j = 0, 1, \ldots, N - 1$$

[0003] The expression (1) represents a case where input data X is of N terms. 2<sup>m</sup> points are generally employed, where m is a natural number. A description will now be made on 8 point DCT where N = 8 (m = 3). As can be seen from the expression (1), DCT is a matrix operation, and in practice, this processing is realized by product sum operation.
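As an illustration of expression (1), the following minimal sketch builds the coefficient matrix for N = 8 and evaluates the product sum directly. It is not the processor of the invention. The function names `dct_matrix` and `dct_1d` are hypothetical, floating-point arithmetic is assumed, and the scale factor C is applied to the output index j as in expression (2).

```python
import math


def dct_matrix(n=8):
    """Return a[i][j], with i the input index and j the output index."""
    def c(j):
        return math.sqrt(0.5) if j == 0 else 1.0

    return [
        [math.sqrt(2.0 / n) * c(j) * math.cos((2 * i + 1) * j * math.pi / (2 * n))
         for j in range(n)]
        for i in range(n)
    ]


def dct_1d(x):
    """Product sum form of Y = AX: y[j] = sum over i of a[i][j] * x[i]."""
    n = len(x)
    a = dct_matrix(n)
    return [sum(a[i][j] * x[i] for i in range(n)) for j in range(n)]


# A constant input has only a zeroth-term component.
y = dct_1d([1.0] * 8)
```

For the constant input used here, only y[0] is non-zero, which is a quick sanity check on the coefficient table.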

[0004] Fig. 1 shows configuration of a conventional DCT processor. This DCT processor is described in, for example, IEEE, Proceedings of Custom Integrated Circuits Conference 89, 1989, pp. 24.4.1 to 24.4.4.

[0005] Referring to Fig. 1, the conventional DCT processor includes eight product sum operation units 100a to 100h arranged in parallel for calculating respective terms y0 to y7 of output data Y.

[0006] Each of product sum operation units 100a to 100h is of the same configuration and includes a parallel multiplier 101 for taking a product of input data xi (i = 0 to 7) and a predetermined weighting coefficient, and an accumulator 102 for accumulating an output of parallel multiplier 101 to generate output data yj (j = 0 to 7). Here, reference characters 101 and 102 generically denote respective components 101a to 101h and 102a to 102h. In the following description also, reference numerals having no suffixes generically denote corresponding elements.

[0007] Accumulator 102 includes a 2-input adder 103 for receiving an output of parallel multiplier 101 at its one input, and an accumulating register 104 for latching an output of adder 103. An output of register 104 is applied to an output terminal 106 and also to the other input of adder 103. Data yj of the respective terms of output data Y are sequentially output through a selector not shown from output terminal 106. An operation will now be described.

[0008] Identical data are applied through an input terminal 105 to product sum operation units 100a to 100h. The following arithmetic operation is carried out in each of product sum operation units 100a - 100h:

$$\begin{aligned} y_j &= \sum_{i=0}^{7} A(i, j) \cdot x_i \\ &= \frac{1}{2} \sum_{i=0}^{7} C(j) \cdot \cos \frac{(2i+1) j \pi}{16} \cdot x_i \end{aligned} \qquad i, j = 0, 1, \ldots, 7 \tag{2}$$


For example, data y0 of a zeroth term in an output data vector Y is calculated as follows in product sum operation unit 100a.

[0009] When receiving zeroth-term data x0 (hereinafter referred to simply as input data) in an input data vector, parallel multiplier 101a outputs a product A (0, 0) · x0 of data x0 and a coefficient A (0, 0) to adder 103a. Register 104a has been reset, and the content thereof is 0. Accordingly, product A (0, 0) · x0 is output from adder 103a and then stored in register 104a.

[0010] When input data x1 is applied, a product A (1, 0) · x1 is output from multiplier 101a. The output of adder 103a is A (0, 0) · x0 + A (1, 0) · x1, which is stored in register 104a.

[0011] By repetition of such an operation, an output of accumulator 102a provided after application of input data x7 is $\sum_{i} A(i, 0) \cdot x_i$, so that output data y0 is obtained. Similar calculation (which differs merely in values of a weighting coefficient A (i, j)) is carried out also in the remaining product sum operation units 100b - 100h, and output data y1 - y7 are obtained. These output data y0 - y7 are sequentially output through output terminal 106.
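The multiply and accumulate sequence of paragraphs [0008] to [0011] can be sketched as follows. This is a behavioural model under stated assumptions, not a description of the cited circuit: the names `coeff` and `mac_unit` are hypothetical, and floating-point values stand in for the fixed-point words of the hardware.

```python
import math


def coeff(i, j):
    """Weighting coefficient A(i, j) of expression (2) for the 8-point case."""
    c = math.sqrt(0.5) if j == 0 else 1.0
    return 0.5 * c * math.cos((2 * i + 1) * j * math.pi / 16)


def mac_unit(samples, j):
    """One product sum operation unit producing output term y_j."""
    register = 0.0                      # accumulating register 104, reset to 0
    for i, x in enumerate(samples):     # x0 .. x7 arrive one per cycle
        product = coeff(i, j) * x       # parallel multiplier 101
        register = register + product   # adder 103, result latched in register 104
    return register


x = [8, 16, 24, 32, 40, 48, 56, 64]
# Eight units working on identical input data, as units 100a to 100h do.
y = [mac_unit(x, j) for j in range(8)]
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in y)
```

Because the coefficient matrix is orthonormal, `energy_in` and `energy_out` agree up to rounding, which gives a simple check that all eight units use consistent coefficients.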

[0012] In contrast to the DCT operation, there is an inverse DCT operation for carrying out the inverse operation of the DCT operation. The inverse DCT (IDCT) is expressed as follows:

$$X = A'Y$$

where an input data vector X is obtained from an output data vector Y. That is, the only difference between the DCT operation and the IDCT operation is a difference between coefficients A and A'. Thus, in the configuration of Fig. 1, the IDCT operation can be carried out by changing the coefficients in parallel multipliers 101a - 101h.

[0013] In other words, the DCT and the IDCT can be carried out on the same hardware. The only additional hardware is a control circuit (not shown) for making a selection between a coefficient for DCT and that for IDCT.
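The point that DCT and IDCT differ only in their coefficients can be illustrated with a short sketch. It assumes the orthonormal scaling of expression (1), for which the inverse coefficients A' are obtained simply by swapping the two indices of A; the specification itself does not state how A' is derived. The names `a`, `product_sum`, `dct` and `idct` are hypothetical.

```python
import math

N = 8


def a(i, j):
    """Forward coefficient: i is the sample index, j the transform index."""
    c = math.sqrt(0.5) if j == 0 else 1.0
    return math.sqrt(2.0 / N) * c * math.cos((2 * i + 1) * j * math.pi / (2 * N))


def product_sum(coeff, data):
    """Shared datapath: out[r] = sum over s of coeff(s, r) * data[s]."""
    return [sum(coeff(s, r) * data[s] for s in range(N)) for r in range(N)]


def dct(x):
    return product_sum(a, x)


def idct(y):
    # Same datapath, only the coefficient selection changes (indices swapped).
    return product_sum(lambda j, i: a(i, j), y)


x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
x_back = idct(dct(x))
```

Running the forward and inverse passes in sequence returns the original samples up to rounding error.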

[0014] The above-described one-dimensional DCT operation can be expanded to a two-dimensional DCT operation. The two-dimensional DCT operation is obtained by making both input data vector X and output data vector Y be two-dimensional vectors.

[0015] Fig. 2 shows configuration of a conventional two-dimensional DCT (or IDCT) processor. Referring to Fig. 2, the processor includes a first one-dimensional DCT processing section 111a for subjecting input data from input terminal 105 to one-dimensional DCT processing, a transposition circuit 112 for rearranging rows and columns of an output of first one-dimensional DCT processing section 111a, and a second one-dimensional DCT processing section 111b for subjecting an output of transposition circuit 112 to one-dimensional DCT processing. First one-dimensional DCT processing section 111a performs a DCT (or IDCT) operation in a row direction, and second one-dimensional DCT processing section 111b performs a DCT (or IDCT) operation in a column direction.

[0016] Fig. 3 is a diagram showing configuration of the transposition circuit of Fig. 2. Referring to Fig. 3, transposition circuit 112 includes a buffer memory 121 and an address generation circuit 122 for generating write/read addresses of buffer memory 121. Buffer memory 121 receives output data of first one-dimensional DCT processing section 111a through an input terminal 125 and sequentially stores the same therein in accordance with an address signal from address generation circuit 122. Also, buffer memory 121 applies corresponding data from an output terminal 126 to second one-dimensional DCT processing section 111b in accordance with an address signal from address generation circuit 122. An operation will now be described. Input data X and output data Y are two-dimensional, the elements of which are each represented by x (i, j) and y (i, j), i, j = 0, 1 ... 7.

[0017] Input data are applied in the order of rows to first one-dimensional DCT processing section 111a. More specifically, input data are applied to input terminal 105 in the order of 8-term row vectors x (0, j), x (1, j), ... x (7, j).

[0018] First one-dimensional DCT processing section 111a performs the DCT operation for each row vector to output intermediate data Z. At that time, first DCT processing section 111a outputs intermediate data of row vectors in the order of rows, i.e., z (0, j), z (1, j) .... Accordingly, a DCT operation in the row direction of input data X is carried out.

[0019] As shown in Fig. 3, transposition circuit 112 first stores the intermediate data from first DCT processing section 111a into buffer memory 121 in the order of receiving of the intermediate data (the order of rows).

[0020] Then, intermediate data Z are read in the order of columns, i.e., the order of column vectors z (i, 0), z (i, 1) ... from buffer memory 121.

[0021] Intermediate data Z read in the order of columns are applied to second DCT processing section 111b. Second DCT processing section 111b carries out one-dimensional DCT processing on the intermediate data. Accordingly, data subjected to one-dimensional DCT processing in the column direction are output from second one-dimensional DCT processing section 111b. Output data Y from second one-dimensional DCT processing section 111b are output in the order of columns from output terminal 106. As a result, two-dimensional DCT shown by the following equation (3) is performed.

$$Y_{uv} = \frac{1}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} C(u) \cdot C(v) \cdot \cos \frac{(2i+1) u\pi}{16} \cdot \cos \frac{(2j+1) v\pi}{16} \cdot X_{ij} \tag{3}$$

$$C(u), C(v) = \begin{cases} \frac{1}{\sqrt{2}}, & (u, v = 0) \\ 1, & (u, v \neq 0) \end{cases}$$

[0022] First and second DCT processing sections 111a and 111b carry out the same processing except for coefficients in the parallel multiplying circuits. If multiplication coefficients of first and second DCT processing sections 111a and 111b are changed, two-dimensional IDCT shown by the following equation (4) is carried out.

$$X_{ij} = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) \cdot C(v) \cdot \cos \frac{(2i+1) u\pi}{16} \cdot \cos \frac{(2j+1) v\pi}{16} \cdot Y_{uv} \tag{4}$$
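The row pass, transposition and column pass of Fig. 2 can be checked against equation (3) with the following sketch. It is an illustrative software model only: the helper names are hypothetical, floating-point arithmetic is assumed, and the final transposition merely restores Y[u][v] indexing for the comparison (the processor of Fig. 2 outputs its data in the order of columns).

```python
import math

N = 8


def c(k):
    return math.sqrt(0.5) if k == 0 else 1.0


def dct_1d(v):
    """8-point one-dimensional DCT of expression (2)."""
    return [0.5 * c(j) * sum(math.cos((2 * i + 1) * j * math.pi / 16) * v[i]
                             for i in range(N))
            for j in range(N)]


def transpose(m):
    """Model of transposition circuit 112: write by rows, read by columns."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    z = [dct_1d(row) for row in block]            # first section 111a, row direction
    yt = [dct_1d(col) for col in transpose(z)]    # second section 111b, column direction
    return transpose(yt)                          # index result as Y[u][v]


def dct_2d_direct(block, u, v):
    """Direct evaluation of equation (3) for one output element."""
    s = 0.0
    for i in range(N):
        for j in range(N):
            s += (math.cos((2 * i + 1) * u * math.pi / 16)
                  * math.cos((2 * j + 1) * v * math.pi / 16) * block[i][j])
    return 0.25 * c(u) * c(v) * s


block = [[(3 * i + 5 * j) % 17 for j in range(N)] for i in range(N)]
y = dct_2d(block)
max_error = max(abs(y[u][v] - dct_2d_direct(block, u, v))
                for u in range(N) for v in range(N))
```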

[0023] The DCT processing and IDCT processing as shown above include a product sum operation. A product operation of this product sum operation is carried out by the parallel multipliers shown in Fig. 1. A multiplier in general requires a large number of adders and the like and has a large scale. Thus, there is a disadvantage that a conventional DCT processor requiring a plurality of parallel multipliers cannot be reduced in size.

[0024] In a semiconductor integrated circuit that operates synchronously, the upper limit of operation speed is determined by the worst delay path (the path which provides the maximum delay). In the conventional configuration, the worst delay path is established by a parallel multiplier, and the operation speed depends on the processing speed of the parallel multiplier. It is thus difficult to implement fast DCT processing and fast IDCT processing.

[0025] From International Journal of Electronics, vol. 69, no. 2, August 1990, London, pages 233 - 246; K.W. Current et al.: "Unified forward and inverse discrete cosine transform architecture and proposed VLSI implementation" a processor and a method for selectively carrying out one-dimensional discrete cosine transform and inverse discrete cosine transform are known. The processor comprises a preshuffle, a butterfly, a multiplier, a pre/postprocessor, a scalar and a postshuffle. The processor performs the product sum operations using the parallel operational units (butterfly and hardware multipliers). For DCT, post-processing is required, whilst for IDCT pre-processing is required.

[0026] From Tencon '89, 22 - 24 November 1989, India, pages 74 - 77; S.N. Merchant et al.: "Distributed Arithmetic Architecture for Image Coding" a discrete cosine transform processor is known performing a two-dimensional discrete cosine transform by column conversion following the row conversion.

[0027] From 1989 IEEE Symposium on Circuits and Systems, 8/11 May 1989, Portland, Oregon, vol. 1, pages 220 - 225; V. Rampa et al.: "Computer-Aided Synthesis of a Bi-Dimensional Discrete Cosine Transform Chip" a discrete cosine transform processor is known for carrying out bi-dimensional discrete cosine transform, wherein a serial multiplier and accumulator performs the matrix operation using a look-up table.

[0028] It is the object of the present invention to provide a down-sized data processor which operates at high speed for carrying out at least one of discrete cosine transform and inverse discrete cosine transform.

[0029] This object is solved by a processor with the features of claim 1 or 12 and by a method having the features of claims 21 or 22.

[0030] Preferred developments of the processor and the method are given in the respective subclaims.

[0031] Since the number of times of multiplication is reduced and no parallel multipliers are employed, the DCT operation and IDCT operation are carried out at a high speed with fewer circuit components.

[0032] The foregoing and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.


- Fig. 1 is a diagram showing configuration of a conventional one-dimensional DCT processor.
- Fig. 2 is a diagram showing configuration of a conventional two-dimensional DCT processor.
- Fig. 3 is a diagram showing configuration of a transposition circuit of Fig. 2.
- Fig. 4 is a diagram showing configuration of a one-dimensional DCT processor being one embodiment of the present invention.
- Fig. 5 is a diagram showing an example of configuration of a preprocessing section shown in Fig. 4.
- Fig. 6 is a diagram showing an example of modification of the preprocessing section of Fig. 5.
- Fig. 7A is a diagram showing an example of configuration of a data rearranging circuit of Fig. 4.
- Fig. 7B is a diagram showing the contents of a shift register of Fig. 7A.
- Fig. 8 is a diagram showing an example of configuration of a product sum operation circuit of Fig. 4.
- Fig. 9 is a diagram of an example of modification of the product sum operation circuit of Fig. 8.
- Fig. 10 is a diagram showing an example of modification of the one-dimensional DCT processor of Fig. 4.
- Fig. 11 is a diagram showing configuration of a one-dimensional IDCT processor being another embodiment of the present invention.
- Fig. 12 is a diagram showing configuration of a one-dimensional DCT/IDCT processor being still another embodiment of the present invention.
- Fig. 13 is a diagram showing configuration of a two-dimensional DCT processor being still another embodiment of the present invention.
- Fig. 14 is a diagram showing configuration of a two-dimensional IDCT processor being still another embodiment of the present invention.
- Fig. 15 is a diagram showing configuration of a two-dimensional DCT/IDCT processor being still another embodiment of the present invention.
- Fig. 16 is a diagram showing configuration of a semiconductor integrated circuit apparatus including the DCT processor of the present invention.
- Fig. 17 is a diagram showing an example of modification of the semiconductor integrated circuit of Fig. 16.

[0033] Fig. 4 schematically shows configuration of a one-dimensional DCT processor being one embodiment of the present invention.

[0034] Referring to Fig. 4, the processor includes a preprocessing section 1 for receiving input data xi from an input terminal 4 to preprocess the received input data xi on the basis of characteristics inherent to DCT operation, a data rearranging circuit 2 for rearranging data output from preprocessing section 1, and a product sum operation section 3 for carrying out a product sum operation on data from data rearranging circuit 2.

[0035] This processor carries out an eight-point DCT operation. Thus, product sum operation section 3 includes eight product sum operation circuits 6a - 6h. Respective product sum operation circuits 6a - 6h provide respective output data y0, y2, y4, y6, y1, y3, y5 and y7 and sequentially apply the output data to an output terminal 5 (the sequential application unit is not shown in the figure).

[0036] A description will now be made on the principle of an 8-point one-dimensional DCT processing operation of the present invention before a detailed description of configuration of each section. If the relationship between input data xi (i = 0, 1, ... 7) and output data yj (j = 0, 1, ... 7) shown in equations (1) and (2) is expressed in a matrix form, the following representation (5) is obtained:

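$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \end{bmatrix} = \begin{bmatrix} A & A & A & A & A & A & A & A \\ D & E & F & G & -G & -F & -E & -D \\ B & C & -C & -B & -B & -C & C & B \\ E & -G & -D & -F & F & D & G & -E \\ A & -A & -A & A & A & -A & -A & A \\ F & -D & G & E & -E & -G & D & -F \\ C & -B & B & -C & -C & B & -B & C \\ G & -F & E & -D & D & -E & F & -G \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix} \tag{5}$$

where

$$A = \frac{1}{2}\cos\frac{\pi}{4}, \ B = \frac{1}{2}\cos\frac{\pi}{8}, \ C = \frac{1}{2}\sin\frac{\pi}{8}, \ D = \frac{1}{2}\cos\frac{\pi}{16}, \ E = \frac{1}{2}\cos\frac{3\pi}{16}, \ F = \frac{1}{2}\sin\frac{3\pi}{16}, \ G = \frac{1}{2}\sin\frac{\pi}{16}$$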

[0037] In derivation of the above relation (5), well-known properties of trigonometric functions such as $\cos \pi/4 = 1/\sqrt{2}$, $\cos (\pi \pm \theta) = -\cos \theta$ and $\cos (\pi/2 \pm \theta) = \mp\sin \theta$ are utilized. For example, $\cos (3\pi/8) = \cos (\pi/2 - \pi/8) = \sin (\pi/8)$.

[0038] In the above relation (5), each row of the coefficient matrix is symmetric or antisymmetric about its center: the coefficient applied to x(7 - i) equals the coefficient applied to xi for the even-numbered outputs and is its negative for the odd-numbered outputs. By use of this symmetry, relation (5) can be transformed to the following representation (6):

$$\begin{aligned} \begin{bmatrix} y_0 \\ y_2 \\ y_4 \\ y_6 \end{bmatrix} &= \begin{bmatrix} A & A & A & A \\ B & C & -C & -B \\ A & -A & -A & A \\ C & -B & B & -C \end{bmatrix} \begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix} \\ \begin{bmatrix} y_1 \\ y_3 \\ y_5 \\ y_7 \end{bmatrix} &= \begin{bmatrix} D & E & F & G \\ E & -G & -D & -F \\ F & -D & G & E \\ G & -F & E & -D \end{bmatrix} \begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix} \end{aligned} \tag{6}$$

[0039] If a comparison is made between the above relations (5) and (6), it is apparent that the number of times of multiplication for acquiring output data yj is reduced to a half in relation (6) as compared to relation (5). DCT processing in accordance with relation (6) is carried out in this embodiment.
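The halving of the multiplication count can be verified numerically. The sketch below evaluates relation (6) with two 4 by 4 coefficient matrices and compares the result with the direct 8 by 8 product sum of expression (2). The function names and the test vector are hypothetical, and floating-point arithmetic is assumed.

```python
import math

A = 0.5 * math.cos(math.pi / 4)
B = 0.5 * math.cos(math.pi / 8)
C = 0.5 * math.sin(math.pi / 8)
D = 0.5 * math.cos(math.pi / 16)
E = 0.5 * math.cos(3 * math.pi / 16)
F = 0.5 * math.sin(3 * math.pi / 16)
G = 0.5 * math.sin(math.pi / 16)

EVEN = [[A, A, A, A], [B, C, -C, -B], [A, -A, -A, A], [C, -B, B, -C]]   # y0, y2, y4, y6
ODD = [[D, E, F, G], [E, -G, -D, -F], [F, -D, G, E], [G, -F, E, -D]]    # y1, y3, y5, y7


def dct8_folded(x):
    """Relation (6): 2 * 4 * 4 = 32 multiplications."""
    z = [x[k] + x[7 - k] for k in range(4)]   # sums
    w = [x[k] - x[7 - k] for k in range(4)]   # differences
    y = [0.0] * 8
    for r in range(4):
        y[2 * r] = sum(EVEN[r][k] * z[k] for k in range(4))
        y[2 * r + 1] = sum(ODD[r][k] * w[k] for k in range(4))
    return y


def dct8_direct(x):
    """Expression (2): 8 * 8 = 64 multiplications."""
    def c(j):
        return math.sqrt(0.5) if j == 0 else 1.0
    return [0.5 * c(j) * sum(math.cos((2 * i + 1) * j * math.pi / 16) * x[i]
                             for i in range(8))
            for j in range(8)]


x = [12, -7, 33, 5, -20, 18, 0, 41]
max_error = max(abs(p - q) for p, q in zip(dct8_folded(x), dct8_direct(x)))
MULTS_DIRECT = 8 * 8
MULTS_FOLDED = 2 * 4 * 4
```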

[0040] With reference to Fig. 4, preprocessing section 1 generates the following eight intermediate data from input data xi sequentially applied from input terminal 4 by selectively carrying out addition or subtraction. The intermediate data are:

$(x0 + x7)$, $(x1 + x6)$, $(x2 + x5)$, $(x3 + x4)$, $(x0 - x7)$, $(x1 - x6)$, $(x2 - x5)$ and $(x3 - x4)$.

[0041] The result of preprocessing from preprocessing section 1 is represented in finite word length. In the following description, it is assumed that the preprocessing result is indicated by 8-bit data in two's complement notation.

[0042] In order to calculate output data yj by using the preprocessed data from preprocessing section 1, the matrix operation of relation (6) is carried out.

[0043] With respect to output data y2, for example, the following relation (7) is carried out:

$$\begin{aligned} y_2 &= B \cdot (x_0 + x_7) + C \cdot (x_1 + x_6) - C \cdot (x_2 + x_5) - B \cdot (x_3 + x_4) \\ &= \sum_{k=1}^{4} B_k \cdot z_k \end{aligned} \tag{7}$$

where

$$B_1 = B, \ B_2 = C, \ B_3 = -C, \ B_4 = -B$$

$$z_1 = (x_0 + x_7), \ z_2 = (x_1 + x_6), \ z_3 = (x_2 + x_5), \ z_4 = (x_3 + x_4)$$

[0044] Data rearranging circuit 2 of Fig. 4 receives preprocessing results $z_k$ (k = 1, ... 4) from preprocessing section 1. When receiving four necessary preprocessing results $z_k$, data rearranging circuit 2 outputs the least significant bits of respective four preprocessing results $z_k$ in parallel to product sum operation section 3. The parallel bit output is then repeated for each successively higher bit position until the most significant bits are output.

[0045] Product sum operation circuit 6b for output data y2 carries out an operation in accordance with a relation (8) which is a further equivalent transformation of relation (7).

$$y_2 = \sum_{n=1}^{7} \left[ \sum_{k=1}^{4} B_{k} \cdot z_{kn} \right] 2^{-n} + \sum_{k=1}^{4} B_{k} \cdot (-z_{k0}) \tag{8}$$

where  $z_{kn}$  is nth-bit data of preprocessing result  $z_k$ , and  $z_{k0}$  is the most significant bit of  $z_k$ . That is,  $z_k \langle 0|7 \rangle = (z_{k0}, z_{k1}, ... z_{k7})$ . Data  $z_k$  is obtained in the following relation (9):

$$z_k = -z_{k0} + \sum_{n=1}^{7} z_{kn} \cdot 2^{-n} \tag{9}$$

It should be noted that data  $z_k$  is data of 8 bits represented in two's complement notation. Therefore, equations (7) and (8) are mathematically totally equivalent to each other except for a difference in order of product sum operations.

[0046] 4-bit data  $z_{1n}$ ,  $z_{2n}$ ,  $z_{3n}$ , and  $z_{4n}$  are applied in parallel to product sum operation circuit 6b for data y2. The values of coefficients B1, B2, B3 and B4 can be calculated in advance. Product sum operation circuit 6b stores therein a partial sum (10) shown below in the form of a ROM table and outputs a corresponding partial sum with 4-bit data  $z_{1n}$ ,  $z_{2n}$ ,  $z_{3n}$  and  $z_{4n}$  used as an address.

$$\sum_{k=1}^{4} B_{k} \cdot z_{kn} \qquad (n = 0, ... 7) \tag{10}$$

This partial sum is accumulated by an internal accumulator, so that output data y2 is output from terminal 5. Although the term for the most significant bit $z_{k0}$ has a negative sign, this term can be converted so that it is handled by an addition operation in two's complement notation.
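The equivalence of relations (7) and (8), and the use of the bit pattern $z_{1n}$, $z_{2n}$, $z_{3n}$, $z_{4n}$ as a table address, can be illustrated as follows. This is a sketch under stated assumptions: the ROM is modelled as a Python list of floating-point partial sums, $z_{1n}$ is taken as the most significant address bit (the specification does not fix the address bit order), and the names `ROM`, `bits`, `y2_distributed` and `y2_direct` are hypothetical.

```python
import math

B = 0.5 * math.cos(math.pi / 8)
C = 0.5 * math.sin(math.pi / 8)
COEFFS = [B, C, -C, -B]          # B_1 .. B_4 of relation (7)

# Partial sum (10) for every 4-bit address; address bit k selects coefficient B_k.
ROM = [sum(COEFFS[k] for k in range(4) if (addr >> (3 - k)) & 1)
       for addr in range(16)]


def bits(value):
    """Bits (z_0 .. z_7) of an integer in [-128, 127]; z_0 is the sign bit."""
    u = value & 0xFF
    return [(u >> (7 - n)) & 1 for n in range(8)]


def y2_distributed(z_ints):
    """Relation (8): table look-ups weighted by 2**-n, sign bit weighted by -1."""
    zb = [bits(v) for v in z_ints]
    total = 0.0
    for n in range(8):
        addr = 0
        for k in range(4):
            addr = (addr << 1) | zb[k][n]
        weight = -1.0 if n == 0 else 2.0 ** -n
        total += ROM[addr] * weight
    return total


def y2_direct(z_ints):
    """Relation (7), with each 8-bit word read as the fraction of relation (9)."""
    return sum(COEFFS[k] * z_ints[k] / 128.0 for k in range(4))


z = [100, -37, 5, -128]
difference = abs(y2_distributed(z) - y2_direct(z))
```

No multiplication by a coefficient appears in `y2_distributed`; the only scaling is by powers of two, which corresponds to bit shifts in hardware.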

[0047] Referring to Fig. 4, in product sum operation section 3, product sum operation circuits 6a - 6d apply the operation to the same data $z_k$ in parallel to produce output data y0, y2, y4 and y6. Product sum operation circuits 6e - 6h apply the operation in parallel to the same data $w_k$, i.e., the subtraction results (x0 - x7), (x1 - x6), (x2 - x5) and (x3 - x4), to produce output data y1, y3, y5 and y7.

[0048] Output data y0 to y7 are sequentially output in this order from output terminal 5 by a selector not shown. A description will now be given on a detailed configuration of each section shown in Fig. 4.

[0049] Fig. 5 shows configuration of preprocessing section 1 shown in Fig. 4. Preprocessing section 1 includes an input circuit 21 for receiving input data xi sequentially applied from input terminal 4. Input circuit 21 outputs input data xp and xq in a predetermined combination under control by a control circuit 25. Here, a relation p + q = 7 is satisfied.

Input circuit 21 can be formed of a tapped shift register. Data at a desired stage can be read by selecting a tap under control by control circuit 25 by employing, for example, a multiplexer.

[0050] Preprocessing section 1 further includes a 2-input adder 22 for adding outputs of input circuit 21, a subtractor 23 for subtracting outputs of input circuit 21, and an output circuit 24 for selecting one of respective outputs of adder 22 and subtractor 23 under control by control circuit 25. Adder 22 and subtractor 23 carry out addition and subtraction for the applied data under control by control circuit 25.

[0051] Output circuit 24 preferably alternately selects the output of adder 22 and that of subtractor 23. An operation will now be described.

[0052] Input circuit 21 receives input data X to sequentially output sets of data (x0, x7), (x1, x6), (x2, x5), and (x3, x4).

[0053] Adder 22 adds the data of each set. Adder 22 sequentially outputs data  $z_k$ , i.e., (x0 + x7), (x1 + x6), (x2 + x5) and (x3 + x4).

[0054] Subtractor 23 sequentially outputs data  $w_k$ , i.e., (x0 - x7), (x1 - x6), (x2 - x5) and (x3 - x4).

[0055] Output circuit 24 alternately outputs data zk and data wk.
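The behaviour of preprocessing section 1 described in paragraphs [0052] to [0055] can be sketched as follows. The function name `preprocess` and the tagged tuple output are hypothetical conveniences for illustration; the circuit itself outputs only the data words.

```python
def preprocess(x):
    """Model of Fig. 5: pair inputs with p + q = 7, then add and subtract."""
    out = []
    for p in range(4):
        q = 7 - p                       # input circuit 21 selects the pair (xp, xq)
        total = x[p] + x[q]             # adder 22 produces z_k
        diff = x[p] - x[q]              # subtractor 23 produces w_k
        out.append(("z", total))        # output circuit 24 alternates z_k ...
        out.append(("w", diff))         # ... and w_k
    return out


stream = preprocess([1, 2, 3, 4, 5, 6, 7, 8])
values = [v for _, v in stream]
```

For the inputs 1 to 8 every sum is 9, while the differences are -7, -5, -3 and -1.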


[0056] Product sum operation circuits 6a to 6d of Fig. 4 carry out an operation in accordance with data $z_k$, while product sum operation circuits 6e to 6h carry out an operation in accordance with data $w_k$.

[0057] Output circuit 24 alternately outputs addition data  $z_k$  and subtraction data  $w_k$ . This makes it possible to produce output data y0 to y7 in this order from product sum operation section 3 and implement a pipelined architecture for processing data in synchronization with a clock signal.

[0058] In that case, it is unnecessary that adder 22 and subtractor 23 carry out an arithmetic operation simultaneously. Accordingly, as shown in Fig. 6, an arithmetic unit 26 for alternately performing the adding processing and the subtracting processing under control by control circuit 25 may be employed. Output circuit 24 does not have to have a selecting function in the configuration of Fig. 6. Output circuit 24 is required to have a function of buffering and latching (in the case of a clock synchronizing operation) an output of arithmetic unit 26. In the configuration of Fig. 6, since the addition and subtraction are carried out in a single arithmetic unit 26, the circuit scale is reduced.

[0059] Such configuration may be employed that intermediate data  $w_k$  is output after all intermediate data  $z_k$  are output from preprocessing section 1.

[0060] Fig. 7A shows configuration of data rearranging circuit 2 of Fig. 4. Data rearranging circuit 2 includes an input circuit 31 for receiving intermediate data from a terminal 500, a shift register 32 for sequentially storing therein data from input circuit 31, and a selector 33 for sequentially reading four intermediate data stored in shift register 32 from the least significant bit.

[0061] When intermediate data $z_k$ and intermediate data $w_k$ are received alternately, input circuit 31 first outputs all of intermediate data $z_k$ and then sequentially outputs intermediate data $w_k$. This configuration can easily be implemented by using a register for storing intermediate data $w_k$ therein.

[0062] When intermediate data $w_k$ are applied after all of intermediate data $z_k$ are applied, input circuit 31 sequentially inputs intermediate data from terminal 500. Input circuit 31, however, has a function of latching intermediate data $w_k$ until the reading of intermediate data $z_k$ by the selector is completed. The data acceptance, latching and output operations of input circuit 31 are controlled by a control circuit 34.

[0063] Shift register 32 stores therein four intermediate data from input circuit 31. Shift register 32 includes four 8-bit registers 32a-32d in a row direction as shown in Fig. 7B. Fig. 7B shows a state where four intermediate data z<sub>1</sub> to z<sub>4</sub> are stored in shift register 32.

[0064] Intermediate data  $z_k$  from input circuit 31 are sequentially stored in registers 32a - 32d. After all intermediate data z1 to z4 are stored in register 32, data of registers 32a - 32d are read in parallel sequentially from the respective least significant bits. Such configuration can be implemented by shift registers capable of shifting in both row and column directions. Even by use of a shift register capable of shifting only in the row direction, if a register stage is selected by selector 33, the data rearranging operation can be realized.

[0065] A data bit shifting operation of shift register 32 is controlled by control circuit 34. Selector 33 reads data of 4 bits in parallel from shift register 32 under control by control circuit 34.

[0066] Four-bit data z<sub>kn</sub> are output from a terminal 501 in the configuration of Fig. 7A.
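The rearrangement from four 8-bit words to eight 4-bit slices can be sketched as below. The function name `bit_slices` is hypothetical, the words are given as unsigned 8-bit patterns, and placing the bit of the first register in the most significant position of each slice is an assumption made for illustration.

```python
def bit_slices(words, width=8):
    """Return one slice per bit position, least significant position first."""
    slices = []
    for n in range(width):                       # n = 0 is the least significant bit
        addr = 0
        for w in words:                          # registers 32a .. 32d, in order
            addr = (addr << 1) | ((w >> n) & 1)  # selector 33 picks bit n of each word
        slices.append(addr)
    return slices


slices = bit_slices([0b00000001, 0b00000010, 0b00000011, 0b10000000])
```

Each element of `slices` corresponds to one 4-bit word $z_{kn}$ delivered at terminal 501.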

[0067] Fig. 8 shows configuration of product sum operation circuit 6. Referring to Fig. 8, product sum operation circuit 6 includes a partial sum generating circuit 41 for generating a partial sum in accordance with data from terminal 501, and an accumulator 42 for accumulating an output of partial sum generating circuit 41.

[0068] Partial sum generating circuit 41 includes a ROM (Read Only Memory) 43 for receiving 4-bit data $z_{kn}$ as an address signal. ROM 43 stores the partial sum shown in, for example, equation (10) in the form of a table and, when supplied with 4-bit data $z_{kn}$, outputs a corresponding value. By constructing this partial sum generating circuit 41 in the form of the ROM table, a partial sum can be generated at a high speed without any multiplication.

[0069] Accumulator 42 includes an adder 44 for receiving a partial sum from partial sum generating circuit 41 at its one input, an accumulating register 45 for storing an output of adder 44, and a shifter 46 for shifting an output of register 45 by predetermined bits to apply the shifted output to the other input of adder 44. Output data yj is applied from shifter 46 to terminal 5. A description will now be made on an operation thereof, taking output data y2 as an example.

[0070] Four-bit data $z_{kn}$ are applied in turn from the least significant bit to partial sum generating circuit 41. Partial sum generating circuit 41 sequentially outputs a partial sum

$$\sum_{k=1}^{4} B_k \cdot z_{kn}$$

from ROM 43.

[0071] First, a partial sum

$$\sum_{k=1}^{4} B_k \cdot z_{k7}$$

is stored in register 45.

[0072] Then, a partial sum

$$\sum_{k=1}^{4} B_k \cdot z_{k6}$$


is output from partial sum generating circuit 41.

[0073] Shifter 46 shifts the contents of register 45 by one bit. Accordingly, an output of shifter 46 is shown as below:

$$\left[\sum_{k=1}^{4} B_{k} \cdot z_{k7}\right] \cdot 2^{-1}$$

[0074] The output of adder 44 is shown as below:

$$\sum_{k=1}^{4} B_{k} \cdot z_{k6} + \left(\sum_{k=1}^{4} B_{k} \cdot z_{k7}\right) \cdot 2^{-1}$$

By sequentially repeating this operation, the following output (11) is stored in register 45.

$$\sum_{n=1}^{7} \left[ \sum_{k=1}^{4} B_{k} \cdot z_{kn} \right] \cdot 2^{-n+1} \tag{11}$$

[0075] When data $z_{k0}$ is applied, the contents of register 45 become the value shown in relation (8), since the data represented by the above relation (11) is shifted by one bit and then added by adder 44. After that, the shifting operation by shifter 46 is stopped and the contents of register 45 are read, whereby output data y2 is obtained.
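The cycle-by-cycle contents of register 45 described in paragraphs [0070] to [0075] can be traced with the following sketch. It is a behavioural model only: floating-point values replace the fixed-point words, the one-bit shift is modelled as a division by two, and the negative weight of the sign bit is applied by subtracting on the last cycle, which is one possible way of handling it and is an assumption here. The name `accumulate_y2` is hypothetical.

```python
import math

B = 0.5 * math.cos(math.pi / 8)
C = 0.5 * math.sin(math.pi / 8)
COEFFS = [B, C, -C, -B]
ROM = [sum(COEFFS[k] for k in range(4) if (addr >> (3 - k)) & 1)
       for addr in range(16)]


def accumulate_y2(z):
    """Return the register contents after each of the eight cycles."""
    register = 0.0
    trace = []
    for n in range(8):                       # bit position n holds z_k(7-n); LSB first
        addr = 0
        for v in z:
            addr = (addr << 1) | ((v >> n) & 1)
        shifted = register / 2.0             # shifter 46: previous contents times 2**-1
        if n == 7:
            register = shifted - ROM[addr]   # sign-bit cycle (assumed subtraction)
        else:
            register = shifted + ROM[addr]   # adder 44
        trace.append(register)
    return trace


z = [100, -37, 5, -128]                      # four 8-bit two's complement words
trace = accumulate_y2(z)
y2 = trace[-1]
expected = sum(COEFFS[k] * z[k] / 128.0 for k in range(4))
```

The value held after the seventh cycle, `trace[6]`, corresponds to relation (11), and the final value agrees with relation (7) evaluated on the fractional words.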

[0076] The operation of partial sum generating circuit 41 and accumulator 42 is controlled by control circuit 47.

[0077] Fig. 9 shows another configuration of a product sum operation circuit. The product sum operation circuit shown in Fig. 9 is different from the configuration shown in Fig. 8 in that a partial sum generating circuit 41 includes two ROMs 43a and 43b and an adder 48 for adding outputs of ROMs 43a and 43b.

[0078] ROM 43a receives the higher order bits, while ROM 43b receives the lower order bits. In this configuration, partial sums P and Q shown in the following equations are generated by ROMs 43a and 43b, respectively.

$$P = \sum_{k=1}^{2} B_k \cdot z_{kn}$$

$$Q = \sum_{k=3}^{4} B_k \cdot z_{kn}$$

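In the configuration of Fig. 9, the number of words to be stored in the ROMs is drastically reduced. This is because the number of words to be stored is determined by the number of bits of the address signal and grows as two raised to the power of that number of bits. A single ROM with a 4-bit address holds 2<sup>4</sup> = 16 words, whereas two ROMs with 2-bit addresses hold 2 · 2<sup>2</sup> = 8 words in total.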
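The effect of dividing the table can be checked with the following minimal Python sketch, which is an illustration and not part of the specification. The function names and the integer coefficients are hypothetical; the sketch only assumes that each table is addressed by one bit per coefficient, as in the partial sums P and Q above.

```python
def build_table(coeffs):
    """Partial-sum table addressed by one bit per coefficient."""
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(2 ** len(coeffs))]


def single_rom_lookup(coeffs, bits):
    """One table for all inputs (as in Fig. 8): returns (partial sum, words stored)."""
    table = build_table(coeffs)
    addr = sum(b << k for k, b in enumerate(bits))
    return table[addr], len(table)


def split_rom_lookup(coeffs, bits):
    """Two half-size tables plus one adder (as in Fig. 9): returns (P + Q, words stored)."""
    half = len(coeffs) // 2
    table_p = build_table(coeffs[:half])
    table_q = build_table(coeffs[half:])
    addr_p = sum(b << k for k, b in enumerate(bits[:half]))
    addr_q = sum(b << k for k, b in enumerate(bits[half:]))
    return table_p[addr_p] + table_q[addr_q], len(table_p) + len(table_q)
```

For four inputs both functions return the same partial sum, with 16 stored words in the first case and 8 in the second; the price of the reduction is the additional adder 48.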

[0079] In the above configuration, product sum operation section 3 includes eight product sum operation circuits 6a - 6h. Intermediate data  $z_k$  and  $w_k$  are not calculated simultaneously. When intermediate data  $w_k$  are calculated after all intermediate data  $z_k$  are calculated and then output data y0, y2, y4 and y6 are calculated, product sum operation section 3 can be formed of four product sum operation circuits 6a - 6d as shown in Fig. 10.

[0080] Product sum operation circuits 6a - 6d calculate y0 and y1, y2 and y3, y4 and y5, and y6 and y7, respectively. The contents of a ROM for partial sum generation are changed in accordance with intermediate data  $z_k$  and  $w_k$ . If the ROM is structured in bank architecture, the change of the coefficient table can easily be realized.

[0081] A description will now be made on a structure for an IDCT operation with reference to Fig. 11. Referring to Fig. 11, an 8-point one-dimensional IDCT processor includes a data rearranging circuit 2 for rearranging data from an input terminal 4, a product sum operation section 3 for performing a product sum operation in accordance with an output of data rearranging circuit 2, and a postprocessing section 7 for carrying out addition and subtraction of a predetermined combination of outputs of product sum operation section 3.

[0082] Data rearranging circuit 2 and postprocessing section 7 are of the same configurations as those of rearranging circuit 2 and preprocessing section 1 of Fig. 4, respectively. An operation will now be described.

[0083] Input data  $y_i$  (i = 0, 1, ... 7) applied to terminal 4 undergoes an IDCT processing, so that output data  $x_i$  (i = 0, 1, ... 7) is transmitted to terminal 5. The relationship between data  $y_i$  and  $x_i$  is represented in the following matrix form (12).

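$$\begin{bmatrix} x0 \\ x1 \\ x2 \\ x3 \\ x4 \\ x5 \\ x6 \\ x7 \end{bmatrix} = \begin{bmatrix} A & D & B & E & A & F & C & G \\ A & E & C & -G & -A & -D & -B & -F \\ A & F & -C & -D & -A & G & B & E \\ A & G & -B & -F & A & E & -C & -D \\ A & -G & -B & F & A & -E & -C & D \\ A & -F & -C & D & -A & -G & B & -E \\ A & -E & C & G & -A & D & -B & F \\ A & -D & B & -E & A & -F & C & -G \end{bmatrix} \begin{bmatrix} y0 \\ y1 \\ y2 \\ y3 \\ y4 \\ y5 \\ y6 \\ y7 \end{bmatrix}$$
(12)

where

$$A = \frac{1}{2}\cos\frac{\pi}{4}, B = \frac{1}{2}\cos\frac{\pi}{8}, C = \frac{1}{2}\sin\frac{\pi}{8}, D = \frac{1}{2}\cos\frac{\pi}{16}, E = \frac{1}{2}\cos\frac{3\pi}{16}, F = \frac{1}{2}\sin\frac{3\pi}{16}, G = \frac{1}{2}\sin\frac{\pi}{16}$$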

This coefficient matrix is a transposed matrix of the coefficient matrix of equation (5). If the symmetry with respect to rows of the coefficient matrix of expression (12) is utilized, expression (12) is changed to the following equivalent expression (13).

$$\begin{bmatrix} x0 \\ x1 \\ x2 \\ x3 \end{bmatrix} = \begin{bmatrix} A & B & A & C \\ A & C & -A & -B \\ A & -C & -A & B \\ A & -B & A & -C \end{bmatrix} \begin{bmatrix} y0 \\ y2 \\ y4 \\ y6 \end{bmatrix} + \begin{bmatrix} D & E & F & G \\ E & -G & -D & -F \\ F & -D & G & E \\ G & -F & E & -D \end{bmatrix} \begin{bmatrix} y1 \\ y3 \\ y5 \\ y7 \end{bmatrix}$$

$$\begin{bmatrix} x7 \\ x6 \\ x5 \\ x4 \end{bmatrix} = \begin{bmatrix} A & B & A & C \\ A & C & -A & -B \\ A & -C & -A & B \\ A & -B & A & -C \end{bmatrix} \begin{bmatrix} y0 \\ y2 \\ y4 \\ y6 \end{bmatrix} - \begin{bmatrix} D & E & F & G \\ E & -G & -D & -F \\ F & -D & G & E \\ G & -F & E & -D \end{bmatrix} \begin{bmatrix} y1 \\ y3 \\ y5 \\ y7 \end{bmatrix}$$
(13)

It should be noted that there are only two types of the coefficient matrix of expression (13). Assume that these two types are M and N. The processor shown in Fig. 11 carries out an IDCT operation in accordance with expression (13).

[0084] Data rearranging circuit 2 receives data yj (j = 0, 1, ... 7) from terminal 4 to rearrange data y0, y2, y4 and y6 and sequentially output the rearranged data from the least significant bit. That is, 4-bit data  $y_{0n}$ ,  $y_{2n}$ ,  $y_{4n}$  and  $y_{6n}$  (n = 0, 1, ... 7) of input data yj are output from data rearranging circuit 2, so that generation and accumulation of partial sums are carried out. As intermediate data, the following data is output:

$$\sum_{n=1}^{7} \left[ \sum_{k=1}^{4} B_{k}' \cdot y_{kn} \right] \cdot 2^{-n} + \sum_{k=1}^{4} B_{k}' \cdot (-y_{k0})$$

This result corresponds to, for example, an intermediate term  $M2 = (A \cdot y0 - C \cdot y2 - A \cdot y4 + B \cdot y6)$  for x2. Then, data  $y_{1n}$ ,  $y_{3n}$ ,  $y_{5n}$  and  $y_{7n}$  (n = 0, 1, ... 7) are output from data rearranging circuit 2. The bit data is subjected to a product sum operation in product sum operation section 3. Accordingly, the remaining terms are obtained. For example, an intermediate term  $N2 = (F \cdot y1 - D \cdot y3 + G \cdot y5 + E \cdot y7)$  for x2 is obtained. Intermediate terms Mi (i = 0, 1, ... 7) and Ni (i = 0, 1, ... 7) are output in turn from product sum operation section 3. From expression (13), the following relations are satisfied: Mi =  $M_{7-i}$ , and Ni =  $N_{7-i}$ .

[0085] Data rearranging circuit 2 may alternately output data bits (y0, y2, y4, y6) and data bits (y1, y3, y5, y7). Each of product sum operation circuits 6a - 6d calculates Mi (=  $M_{7-i}$ ), and each of product sum operation circuits 6e-6h calculates Ni (=  $N_{7-i}$ ).

[0086] Postprocessing section 7 obtains a sum of and a difference between intermediate data Mi and Ni to generate output data xi and output the same to terminal 5. Accordingly, the following relation is obtained:

$$xi = Mi + Ni \quad (i = 0, 1, 2, 3)$$

$$xi = Mi - Ni \quad (i = 4, 5, 6, 7)$$

[0087] Postprocessing section 7 has the same configuration as that of Fig. 5 or 6. In that case, input circuit 21 sequentially or alternately receives intermediate terms Mi (i = 0 to 3), Ni (i = 0 to 3) to apply a desired combination of the terms to adder/subtractors 22, 23 (or 26). The order in which data are selected in the input circuit is determined by control circuit 25. In this case, data may be applied in the order of x0, x7, x1, x6, x2, x5, x3, x4 to output circuit 24, and output circuit 24 may output the data in the order of x0, x1, ... x7.

[0088] This one-dimensional IDCT processor can also be structured such that a single product sum operation circuit 6 calculates both intermediate terms Mi and Ni (i = 0 to 3).
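The IDCT operation of paragraphs [0083] to [0086] can be illustrated by the following minimal Python sketch, which is not part of the specification. The matrices `M` and `N` are the two coefficient matrices of expression (13); the function names are hypothetical, and the reference function `idct8_direct` assumes the usual orthonormal 8-point inverse DCT definition, which is consistent with the coefficients A to G given above.

```python
import math

A = 0.5 * math.cos(math.pi / 4)
B = 0.5 * math.cos(math.pi / 8)
C = 0.5 * math.sin(math.pi / 8)
D = 0.5 * math.cos(math.pi / 16)
E = 0.5 * math.cos(3 * math.pi / 16)
F = 0.5 * math.sin(3 * math.pi / 16)
G = 0.5 * math.sin(math.pi / 16)

# The two 4 x 4 coefficient matrices of expression (13).
M = [[A, B, A, C], [A, C, -A, -B], [A, -C, -A, B], [A, -B, A, -C]]
N = [[D, E, F, G], [E, -G, -D, -F], [F, -D, G, E], [G, -F, E, -D]]


def matvec(matrix, vector):
    """Plain matrix-vector product (stands in for the product sum operation section)."""
    return [sum(c * v for c, v in zip(row, vector)) for row in matrix]


def idct8(y):
    """8-point IDCT using two 4-point product sums followed by add/subtract."""
    m = matvec(M, y[0::2])      # M0..M3 from y0, y2, y4, y6
    n = matvec(N, y[1::2])      # N0..N3 from y1, y3, y5, y7
    x = [0.0] * 8
    for i in range(4):
        x[i] = m[i] + n[i]      # postprocessing: sum gives x_i
        x[7 - i] = m[i] - n[i]  # postprocessing: difference gives x_(7-i)
    return x


def idct8_direct(y):
    """Assumed reference: x_i = sum_k (c_k / 2) * y_k * cos((2i + 1) k pi / 16)."""
    out = []
    for i in range(8):
        total = 0.0
        for k in range(8):
            ck = math.sqrt(0.5) if k == 0 else 1.0
            total += 0.5 * ck * y[k] * math.cos((2 * i + 1) * k * math.pi / 16)
        out.append(total)
    return out
```

The decomposed form uses 2 × 16 = 32 multiplications per 8-point transform, against 64 for the full 8 × 8 matrix of expression (12).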

[0089] The product sum operation in DCT processing and that in IDCT processing are identical to each other except for their coefficient matrices. Accordingly, as shown in Fig. 12, a processor capable of selectively performing the DCT processing and the IDCT processing is obtained.

[0090] Referring to Fig. 12, the processor includes a preprocessing section 1, a data rearranging circuit 2, a product sum operation section 3, a postprocessing section 7 and a control circuit 8.

[0091] Preprocessing section 1 has its input connected to an input terminal 4 and its output connected to data rearranging circuit 2. Data rearranging circuit 2 has its output connected to product sum operation section 3. Product sum operation section 3 has its output connected to an input of postprocessing section 7. The output of postprocessing section 7 is supplied through an output terminal 5. Product sum operation section 3 includes first to eighth product sum operation circuits 6a - 6h.

[0092] Control circuit 8 switches DCT operation and IDCT operation and also controls the operation of the respective circuits.

[0093] A description will now be made on an operation of the processor shown in Fig. 12. In the case of DCT processing, data passes unchanged through postprocessing section 7, so that the processor of Fig. 12 functions in the same manner as the DCT processor shown in Fig. 4. That is, data input from input terminal 4 undergoes addition/subtraction in preprocessing section 1 and is then rearranged in data rearranging circuit 2. The rearranged data is transmitted in turn from the lower order bits to product sum operation section 3. The data subjected to the product sum operation shown in, for example, expression (5) in product sum operation section 3 passes through postprocessing section 7 and is output directly from output terminal 5.

[0094] In the case of inverse DCT processing, data passes unchanged through preprocessing section 1, so that the processor functions in the same manner as the inverse DCT processor shown in Fig. 11. That is, the data input from input terminal 4 passes unchanged through preprocessing section 1 and is then rearranged in data rearranging circuit 2. The rearranged data is transmitted in turn from the lower order bits to product sum operation section 3. The data subjected to the product sum operation in product sum operation section 3 is then subjected to addition/subtraction in postprocessing section 7 to calculate the output data. The added/subtracted data is output from output terminal 5.

[0095] Changes in coefficients in product sum operation section 3 are made by control circuit 8. This is easily realized by switching of banks of ROM or the like.
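The mode switching of paragraphs [0089] to [0095] can be illustrated by the following minimal Python sketch, which is not part of the specification. The function `transform8`, its `mode` argument and the `BANKS` dictionary are hypothetical. The DCT coefficient bank is assumed to consist of the transposes of the matrices M and N of expression (13), based on the statement that the coefficient matrix of expression (12) is the transpose of that of expression (5).

```python
import math

A = 0.5 * math.cos(math.pi / 4)
B = 0.5 * math.cos(math.pi / 8)
C = 0.5 * math.sin(math.pi / 8)
D = 0.5 * math.cos(math.pi / 16)
E = 0.5 * math.cos(3 * math.pi / 16)
F = 0.5 * math.sin(3 * math.pi / 16)
G = 0.5 * math.sin(math.pi / 16)

M = [[A, B, A, C], [A, C, -A, -B], [A, -C, -A, B], [A, -B, A, -C]]
N = [[D, E, F, G], [E, -G, -D, -F], [F, -D, G, E], [G, -F, E, -D]]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matvec(matrix, vector):
    return [sum(c * v for c, v in zip(row, vector)) for row in matrix]


# Coefficient "banks": one pair of 4 x 4 matrices per processing mode.
BANKS = {"dct": (transpose(M), transpose(N)), "idct": (M, N)}


def transform8(data, mode):
    """8-point DCT or IDCT sharing one pair of 4-point product sum operations."""
    first_matrix, second_matrix = BANKS[mode]       # raises KeyError for unknown modes
    if mode == "dct":
        # Preprocessing active: z_k = x_k + x_(7-k), w_k = x_k - x_(7-k).
        first_in = [data[k] + data[7 - k] for k in range(4)]
        second_in = [data[k] - data[7 - k] for k in range(4)]
    else:
        # Preprocessing in the through state: split into even and odd terms.
        first_in, second_in = data[0::2], data[1::2]
    first_out = matvec(first_matrix, first_in)
    second_out = matvec(second_matrix, second_in)
    out = [0.0] * 8
    if mode == "dct":
        # Postprocessing in the through state: interleave y0, y2, .. and y1, y3, ..
        out[0::2] = first_out
        out[1::2] = second_out
    else:
        # Postprocessing active: x_i = Mi + Ni, x_(7-i) = Mi - Ni.
        for i in range(4):
            out[i] = first_out[i] + second_out[i]
            out[7 - i] = first_out[i] - second_out[i]
    return out
```

Only the coefficient pair and the choice of which adder/subtractor stage is bypassed differ between the two modes; the two 4-point product sums are shared.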

[0096] The above-described processor performs a one-dimensional DCT or IDCT operation. This processor can be developed to be able to perform a two-dimensional DCT or IDCT operation.

[0097] Fig. 13 shows configuration of a two-dimensional DCT processor according to the present invention. Referring to Fig. 13, the two-dimensional DCT processor includes a first one-dimensional DCT processing section 11a, a second one-dimensional DCT processing section 11b and a transposition circuit 12.

[0098] First one-dimensional DCT processing section 11a carries out DCT processing with respect to rows, while second one-dimensional DCT processing section 11b carries out DCT processing with respect to columns. Transposition circuit 12 outputs in the order of columns the data applied in the order of rows. First and second processing sections 11a and 11b have the same configuration as that of the one-dimensional DCT processor shown in Fig. 1 and include a preprocessing section 1 (1a, 1b), a data rearranging circuit 2 (2a, 2b) and a product sum operation section 3 (3a, 3b). A description will now be made on a two-dimensional DCT processing of 8 x 8 points taken as an example.

[0099] If expression (3) is rewritten, the following expression (14) is obtained:

$$Yuv = \sum_{j=0}^{7} A(j,v) \cdot (\sum_{i=0}^{7} A(i,u) \cdot xij)$$
 (14)

[0100] Input terminal 4 is supplied with input data in the order of rows. That is, 8-term row vector data x(0, j), x(1, j) ... x(7, j) (j = 0, 1, ... 7) are applied in turn.

[0101] Preprocessing section 1a carries out preprocessing for the respective row vector data. For a zeroth row, for example, data (x00  $\pm$  x07), (x01  $\pm$  x06), (x02  $\pm$  x05) and (x03  $\pm$  x04) are output from preprocessing section 1a. Data rearranging circuit 2a rearranges four words (four addition data or four subtraction data) to output the rearranged data to product sum operation section 3a. Product sum operation section 3a applies a product sum operation to the applied data. The processing operation of data rearranging circuit 2a and product sum operation section 3a is the same as those of the processor shown in Fig. 4.

[0102] Accordingly, first one-dimensional DCT processing section 11a outputs in the order of rows 8-term row vector data Rk subjected to one-dimensional DCT processing with respect to a row direction. Rk is an 8-term row vector of Rk = (Rk0, Rk1, ... Rk7), where k = 0, 1, ... 7.

[0103] This intermediate data Rk is applied to transposition circuit 12 and stored therein in the order of rows. When 8-row intermediate data R0 - R7 are stored in transposition circuit 12, transposition circuit 12 outputs intermediate data in the order of columns to second one-dimensional DCT processing circuit 11b. The intermediate data stored in transposition circuit 12 is data which is subjected to an operation processing with respect to "i" in expression (14). Transposition circuit 12 outputs intermediate data in the order of columns. In the zeroth column, for example, data R00, R10, R20, ... R70 are read in turn.

[0104] Second one-dimensional DCT processing section 11b carries out the same preprocessing, the same data rearranging processing and the same product sum operation processing for each column as those of first one-dimensional DCT processing section 11a. Accordingly, second one-dimensional DCT processing section 11b outputs data subjected to addition with respect to "j" in expression (14). That is, 8-term column vector data are output in the order of columns from output terminal 5. The data appearing on output terminal 5 are data subjected to one-dimensional DCT processing in both row and column directions, i.e., two-dimensional DCT processing.

[0105] Like the transposition in the circuit shown in Fig. 2, transformation from rows to columns in transposition circuit 12 is realized by changing an address of a buffer memory in the row direction in data writing and in the column direction in data reading.
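The row-column procedure of paragraphs [0097] to [0105] can be illustrated by the following minimal Python sketch, which is not part of the specification. The function names are hypothetical, and `dct_1d` uses the orthonormal DCT definition as an assumed form of the coefficients A(i, u) in expression (14); a direct matrix form is used in place of the preprocessing and product sum sections.

```python
import math


def dct_1d(x):
    """Assumed 1-D DCT: X_u = sqrt(2/n) * c_u * sum_i x_i * cos((2i + 1) u pi / (2n))."""
    n = len(x)
    out = []
    for u in range(n):
        cu = math.sqrt(0.5) if u == 0 else 1.0
        s = sum(x[i] * math.cos((2 * i + 1) * u * math.pi / (2 * n)) for i in range(n))
        out.append(math.sqrt(2.0 / n) * cu * s)
    return out


def transpose(block):
    """Stands in for the transposition circuit: written by rows, read by columns."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    """2-D DCT as two 1-D passes with a transposition in between."""
    rows_done = [dct_1d(row) for row in block]      # first 1-D section, row by row
    columns = transpose(rows_done)                  # reorder rows into columns
    return [dct_1d(col) for col in columns]         # second 1-D section, column by column
```

The result is delivered column by column: element `[u][v]` of the returned list is the coefficient obtained from frequency index u along the rows and v along the columns.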

[0106] Also, the two-dimensional IDCT processing can be realized by expanding the one-dimensional IDCT processor shown in Fig. 11. Fig. 14 shows configuration of a two-dimensional IDCT processor.

[0107] Referring to Fig. 14, the two-dimensional IDCT processor includes a first one-dimensional IDCT processor 13a and a second one-dimensional IDCT processor 13b.

[0108] First one-dimensional IDCT processor 13a includes a data rearranging circuit 2a, a product sum operation section 3a and a postprocessing section 7a. Second one-dimensional IDCT processor 13b includes a data rearranging circuit 2b, a product sum operation section 3b and a postprocessing section 7b. Both first and second one-dimensional IDCT processors 13a and 13b carry out the same processing as that of the one-dimensional IDCT processor shown in Fig. 11.

[0109] Input terminal 4 is supplied with input data in the order of rows. First IDCT processor 13a carries out an IDCT processing with respect to rows.

[0110] Transposition circuit 12 sequentially stores therein intermediate data applied in the order of rows from first IDCT processor 13a and outputs the stored intermediate data in the order of columns.

[0111] Second IDCT processor 13b carries out an IDCT processing for the respective columns. Accordingly, output terminal 5 is supplied with the data subjected to the IDCT processing in both row and column directions, i.e., two-dimensional IDCT processing, in the order of columns.

[0112] Fig. 15 shows a configuration of a two-dimensional DCT/IDCT processor according to still another embodiment of the present invention. The processor of Fig. 15 includes a first one-dimensional DCT/IDCT processor 14a, a second one-dimensional DCT/IDCT processor 14b, and a transposition circuit 12 provided between processors 14a and 14b.

[0113] First and second processors 14a and 14b are of the same configuration as that of the processor shown in Fig. 12 and include a preprocessing section 1 (1a, 1b), a data rearranging circuit 2 (2a, 2b), a product sum operation section 3 (3a, 3b) and a postprocessing section 7 (7a, 7b).

[0114] In the configuration of Fig. 15, like the configuration shown in Fig. 12, if preprocessing sections 1a and 1b and postprocessing sections 7a and 7b are selectively set in a through state and coefficients (used in the partial sum generation circuit) of product sum operation sections 3a and 3b are selected, then two-dimensional DCT and IDCT processings can selectively be carried out.

[0115] An operation of the processor of Fig. 15 is identical to those of the processors of Figs. 13 and 14. One processing mode of the DCT processing and the IDCT processing is set by a control circuit not shown (corresponding to control circuit 8 of Fig. 12).

[0116] Although the foregoing description has not been concerned with implementation forms of the DCT processors, the use of the above-described configuration makes it possible to easily incorporate all of DCT (inverse DCT) functions integrally on a semiconductor integrated circuit.

[0117] It is also possible to incorporate all of the above-described DCT/inverse DCT functions integrally on a semiconductor integrated circuit and simultaneously incorporate functional circuitry having functions other than the DCT/inverse DCT functions integrally on one semiconductor substrate. Fig. 16 shows an example of use of a DCT processor which is incorporated integrally on one semiconductor substrate simultaneously with other functional circuitry.

[0118] Referring to Fig. 16, a semiconductor integrated circuit (semiconductor chip) 50 includes a DCT processor 51 and functional circuits 52, 53 and 54.

[0119] DCT processor 51 has such configuration as shown in Fig. 13 or 14. Functional circuits 52, 53 and 54 have different functions A, B and C, respectively. In application to video data processing, functions A, B and C include such functions necessary for image compression as quantization, variable length coding (entropy coding) and the like. The functions necessary for image compression are standardized by, for example, JPEG (Joint Photographic Experts Group).

[0120] In the configuration shown in Fig. 16, DCT processor 51 is used in cooperation with (or in clock synchronization with) functional circuits 52, 53 and 54.

[0121] In the embodiment shown in Fig. 16, the functional circuits integrated together with the DCT processor are dedicated circuits having specific functions. The functional circuits are not limited to such dedicated circuits and may be integrated together with a microprocessor or a programmable functional block 56 such as a DSP (Digital Signal Processor) as shown in, for example, Fig. 17. Further, the DCT processor may be integrated together with a dedicated functional circuit 55 and programmable functional block 56 in combination as shown in Fig. 17.

[0122] The summary of principal technical effects of the present invention is as follows:

- (i) Since the required number of times of multiplication is reduced by preprocessing in DCT processing or by postprocessing in IDCT processing, load on a product sum operation circuit is reduced.
- (ii) Since a product sum operation is carried out by a memory and an adder, the scale of circuitry is substantially reduced.
- (iii) Because of the above item (ii), a parallel multiplication circuit is unnecessary. Accordingly, when the entire processor performs a synchronizing operation, a higher operation speed on a worst delay path is easily achieved, facilitating a faster processing.
- (iv) Since the effect of the above item (iii) facilitates an upgrading of a DCT (or IDCT) processor, this effect is greatly advantageous particularly in implementation of the present DCT (or IDCT) processor on a semiconductor integrated circuit, together with the effect of reducing the circuit scale.

### Claims

1. A processor having at least a function of carrying out one-dimensional discrete cosine transform DCT and one-dimensional inverse discrete cosine transform IDCT of N-term input data X, wherein said N is a positive integer, said processor comprising:

preprocessing means (1; 1a, 1b) for carrying out addition and subtraction for each of sets of predetermined two terms of said input data X to generate a first set of addition data  $(z_k)$  and a second set of subtraction data  $(w_k)$  subject to DCT processing; and

matrix product means (2, 3; 2a, 3a; 2b, 3b) for obtaining a first matrix product of said first set of data (z<sub>k</sub>) from said preprocessing means (1, 1a, 1b) and a predetermined first coefficient matrix (B), and a second matrix product of said second set of data (w<sub>k</sub>) and a predetermined second coefficient matrix (B), wherein an output of said matrix product means provides N-term output data subjected to DCT processing,

said matrix product means including table memory means (43) storing partial product sum data for respective outputs thereof at addresses provided by data received from said preprocessing means, and

accumulation means (42) for summing data received from said table memory means (43), said matrix product means performing N/2 point product sum operations,

a postprocessing section (7; 7a, 7b) for receiving an output of said matrix product means to carry out addition and subtraction of predetermined 2-term data of the received N-term data and generate first and second sets (X2i, X2i + 1) of output data Yi, wherein the output of said postprocessing means providing N-term output data subject to IDCT processing, and

control means (8) for selectively enabling one of said preprocessing section (1; 1a, 1b) and said postprocessing section (7; 7a, 7b).

2. The processor of claim 1, wherein

said preprocessing means (1) includes:

set generating means (21) for generating a set of pth term data X(p) and qth term data X(q) of said input data X, where p + q = N-1,  $0 \le p < q \le N-1$ , and p and q are integers;

addition means (22; 26) for carrying out addition of 2-term data output from said set generating means (21); and

subtraction means (23; 26) for carrying out subtraction of the 2-term data output from said set generating means (21).

3. The processor of claim 1 or 2, wherein

said matrix product means (2, 3) includes

storage means (32) for sequentially receiving said first set of data from said preprocessing means (1) to store the received data therein, each of said first set of data having a plurality of bits, and

parallel reading means (33) for reading, in parallel and in order, one-bit data in the same bit figure of all of said first set of data stored in said storage means (32).

4. The processor of one of claims 1 to 3, wherein

said matrix product means (2, 3) includes

storage means (32) for sequentially receiving said second set of data from said preprocessing means (1) to store the received data therein, each of said second set of data having a plurality of bits, and parallel reading means (33) for reading in parallel and in order, one-bit data in the same bit figure of all of said second set of data stored in said storage means (32).

5. The processor of claim 3 or 4, wherein

said matrix product means (2, 3) further includes

a plurality of first product sum operation means (6a - 6d), and wherein each said product sum operation means includes said table memory means (43) receiving parallel bit data from said parallel reading means (33) as an address signal to output a corresponding partial sum, said table memory means (43) storing in advance the product sum of a corresponding coefficient and said parallel bit data in a table form, and said accumulation means (42) accumulating outputs of said table memory means (43), said accumulation means (42) providing a first set of output data of said N-term output data.

6. The processor of one of claims 1 to 5, wherein

said matrix product means (2, 3) further includes

a plurality of second product sum operation means (6e - 6h), and wherein

each said second product sum operation means includes said table memory means (43) for receiving parallel bit data from said parallel reading means (33) as an address signal to output a corresponding partial sum, said table memory means (43) storing in advance a product sum of a corresponding coefficient and said parallel bit data in a table form, and

said accumulation means (42) accumulating outputs of said table memory means (43) to generate a second set of data of said N-term output data.

7. The processor of claim 5 or 6, wherein

said accumulation means (42) includes

addition means (44) for receiving an output of said table memory means (43) at its one input,

register means (45) for temporarily storing an output of said addition means (44), and

shift means (46) for shifting storage data in said register means (45) by a bit to apply the shifted data to the other input of said addition means (44), a final output of said shift means providing said first or second set of data of said N-term output data.

8. The processor of one of claims 1 to 7, wherein

said postprocessing section (7) includes means (22, 23; 26) for carrying out addition and subtraction of (2i)th-term data Y (2i) and (2i + 1)th-term data Y (2i + 1) of N-term output data Y of said matrix product means (2, 3) wherein said i is an integer of  $0 \le i \le N/2 - 1$ .

9. The processor of claim 8, wherein

the addition of said data Y (2i) and Y (2i + 1) indicates (i)th-term output data Z (i), and the subtraction of said data Y (2i) and Y (2i + 1) indicates (N - i - 1)th-term output data Z (N - i - 1).

10. The processor of one of claims 1 to 9, further comprising:

transposition means (12) for sequentially receiving output data (Rk) of said matrix product means (2, 3) to store the received data therein, transpose a matrix formed by the stored data and sequentially output N-term intermediate data;

second preprocessing means (1b) having the same configuration as that of said preprocessing means (1a), for receiving an output of said transposition means (12) to carry out addition and subtraction for each of predetermined 2-term sets of said N-term intermediate data; and

second matrix product means (2b, 3b) having the same configuration as that of said matrix product means (2a, 3a), for performing a product operation of output data of said second preprocessing means (1b) and a predetermined second coefficient matrix, an output of said second matrix product means (2b, 3b) indicating data subjected to two-dimensional DCT processing.

11. The processor of claim 10, further comprising:

second postprocessing means (7b) having the same configuration as that of said postprocessing means (7a), for receiving an output of said second matrix product means (2b, 3b); and second control means (8) for enabling one of said second preprocessing means (1b) and said second postprocessing means (7b).

12. A processor having at least a function of carrying out one-dimensional inverse discrete cosine transform (IDCT) of N-term input data Y, wherein said N is a positive integer, said processor comprising:

matrix product means (2, 3; 2a, 3a, 2b, 3b) for dividing said N-term input data Y into a first set of input data and a second set of input data and carrying out an N/2 point product operation of said first set of input data and a first coefficient matrix (B') and an N/2 point product operation of said second set of input data and a second coefficient matrix (B'), to generate a first set of intermediate data Mi and a second set of intermediate data Ni, wherein said i is an integer of  $0 \le i \le N/2 - 1$ ,

said matrix product means including table memory means (43) storing partial product sum data for respective intermediate data at addresses provided by the input data, and accumulation means (42) for summing data received from said table memory means (43); and

postprocessing means (7; 7a, 7b) for carrying out addition and subtraction of two intermediate data in a predetermined relationship in said first set of intermediate data Mi and said second set of intermediate data Ni from said matrix product means to generate first and second sets of output data Xi.

13. The processor of claim 12, wherein

said postprocessing means (7; 7a, 7b) includes means (22, 23; 26) for carrying out addition and subtraction of said first set of (i)th-term intermediate data Mi and said second set of (i)th-term intermediate data Ni; and addition data (Mi + Ni) indicates (i)th-term data of N-term output data, and subtraction data (Mi - Ni) indicates (N - i - 1)th-term data of said N-term output data.

14. The processor of claim 12 or 13, wherein

each said intermediate data is represented by a plurality of bits, and said matrix product means (2, 3) includes

storage means (32) for dividing said N-term input data Y into a first set of input data Y (2i) and a second set of input data Y (2i + 1) to store each set of the input data therein,

first reading means (33) for reading in parallel one-bit data in the same order of said first set of all input data Y (2i) from said storage means (32),

second reading means (33) for reading in parallel one-bit data in the same bit figure of said second set of all input data Y (2i + 1) from said storage means (32),

first product sum operation means (6a - 6d) for carrying out a product sum operation of parallel bit data from said first reading means (33) and a corresponding coefficient of said first coefficient matrix, to generate said first set of output data Xi, and

second product sum operation means (6e - 6h) for carrying out a product sum operation of parallel bit data from said second reading means (33) and a corresponding coefficient of said second coefficient matrix, to generate said second set of output data Xi.

15. The processor of claim 14, wherein

said first and second product sum operation means include a plurality of operation circuits each related to one term of said output data Xi, each said operation circuits (6a - 6h) including

said table memory means (43) receiving parallel bit data as an address signal to output the result of the product sum operation with the corresponding coefficient,

said table memory means (43) storing in advance data indicating the result of the product sum operation in a table form, and

said accumulation means (42) accumulating outputs of said table memory means (43).

16. The processor of claim 15, wherein

said accumulation means (42) includes

2-input addition means (44) for receiving an output of said table memory means (43) at its one input,

register means (45) for temporarily storing an output of said addition means, and

shift means (46) for shifting storage data in said register means (45) by a bit to apply the shifted data to the other input of said addition means (44), a final output of said shift means (46) indicating output data of an associated term.

17. The processor of one of claims 12 to 16 further comprising

preprocessing means (1; 1a, 1b) for carrying out addition and subtraction of a predetermined set of 2-term data Y(j), Y(N-j-1) of said N-term input data Y(j) to generate a first set of addition data and a second set of subtraction data, said first set of said addition data and said second set of said subtraction data being applied as said first and second sets of input data to said matrix product means (2, 3); and

control means (8) for enabling one of said preprocessing means (1) and said postprocessing means (7).

18. The processor of one of claims 12 to 17, further comprising:

transposition means (12) for sequentially receiving N-term N output data from said postprocessing means (7a) to store the received data therein, then transpose the stored data and output the transposed data; second matrix product means (2b, 3b) of the same configuration as that of said matrix product means (2a, 3a),

for receiving an output of said transposition means (12); and

second postprocessing means (7b) of the same configuration as that of said postprocessing means (7a), for receiving an output of said second matrix product means (2b, 3b),

an output of said second postprocessing means (7b) indicating data subjected to two-dimensional IDCT processing of N by N points.

19. The processor of one of claims 1 to 18, wherein

said N is 8.

20. The processor of one of claims 1 to 19, wherein

said processor (51) is incorporated integratedly in an integrated circuit (50) so as to operate in cooperation with other functional circuitry (52, 53, 54; 55, 56).

21. A method of processing one-dimensional discrete cosine transform DCT or one-dimensional inverse discrete cosine transform IDCT of N points, X, wherein said N is 2<sup>m</sup>, said m being a natural number, said method comprising the steps of:

> 1st step of receiving N point input data X to divide the received input data X into a first set of N/2 data and a second set of N/2 data, each set including data in a predetermined relationship in said N point input data X;

> 2nd step of carrying out addition and subtraction of each 2-term data in a predetermined relationship to the first set and the second set in said input data X, to generate a first set of addition data $(z_k)$ and a second set of subtraction data $(w_k)$, said first and second sets including N/2-term data subject to DCT processing;

> 3rd step of carrying out an N/2 point product operation of said first set of addition data $(z_k)$ and a first coefficient matrix (B) to generate a first set of output data;

> 4th step of carrying out an N/2 point product operation of said second set of subtraction data $(w_k)$ and a second coefficient matrix (B) to generate a second set of output data, said step of generating said first and second sets of output data includes the steps of generating a corresponding partial sum by reference to table memory means (43), using an applied data as an address signal, and of summing the partial sums;

> 5th step of outputting said first set of output data and said second set of output data subjected to DCT processing in a predetermined order;

> 6th step of carrying out an N/2 point product operation of said first set of input data and a third coefficient matrix (B') to generate a first set of intermediate data M(i);

> 7th step of carrying out an N/2 point product operation of said second set of input data and a fourth coefficient matrix (B') to generate a second set of intermediate data N(i);

> 8th step of carrying out addition and subtraction of said first set of intermediate data M(i) and said second set of intermediate data N(i) to generate a set of addition data and a set of subtraction data; and

> 9th step of outputting said set of said addition data and said set of said subtraction data subjected to IDCT processing in a predetermined order,

> said step of generating said intermediate data M(i) and N(i) includes the step of generating a corresponding partial sum by reference to table memory means (43), using the applied data as an address, and summing up the partial sums for each respective intermediate data, and

> selectively enabling one of operational processing of the second through fifth steps and the sixth through ninth steps.
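The forward path of this method (2nd through 5th steps) can be sketched as a butterfly followed by two N/2-point matrix products. The orthonormal DCT scaling, the function names `dct_coef`, `dct_direct` and `dct_split`, and the interleaved even/odd output order are assumptions made for the example; the direct multiplications stand in for the table look-up and partial-sum accumulation recited in the 4th step.

```python
import math


def dct_coef(k, j, n):
    """Entry (k, j) of an orthonormal N-point DCT-II matrix (assumed scaling)."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))


def dct_direct(x):
    """Reference N-point DCT computed as a full N-by-N matrix product."""
    n = len(x)
    return [sum(dct_coef(k, j, n) * x[j] for j in range(n)) for k in range(n)]


def dct_split(x):
    """N-point DCT from one butterfly stage and two N/2-point products."""
    n = len(x)
    half = n // 2
    # 2nd step: addition data z_k and subtraction data w_k of x(i), x(N-i-1)
    z = [x[i] + x[n - 1 - i] for i in range(half)]
    w = [x[i] - x[n - 1 - i] for i in range(half)]
    # 3rd step: even-numbered outputs depend only on the addition data
    even = [sum(dct_coef(2 * i, j, n) * z[j] for j in range(half))
            for i in range(half)]
    # 4th step: odd-numbered outputs depend only on the subtraction data
    odd = [sum(dct_coef(2 * i + 1, j, n) * w[j] for j in range(half))
           for i in range(half)]
    # 5th step: output in a fixed order (here: natural index order)
    out = [0.0] * n
    out[0::2] = even
    out[1::2] = odd
    return out
```

Each half-size product needs (N/2)² multiplications, so the split form uses N²/2 multiplications where the direct form uses N².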

22. The method of claim 21, wherein

said 2-term data in said predetermined relationship are (i)th-term data x(i) and (N - i - 1)th-term data x(N - i - 1), wherein said i is an integer of $0 \le i \le N/2 - 1$.

23. A method of carrying out one-dimensional inverse discrete cosine transform (IDCT) of N points, wherein said N is 2<sup>m</sup>, said m being a natural number, said method comprising the steps of:

receiving N-term input data Y to generate a first set of input data of even-term data Y(2i) and a second set of input data of odd-term input data Y(2i + 1), wherein said i is an integer of $0 \le i \le N/2 - 1$;

carrying out an N/2 term product operation of said first set of input data and a first coefficient matrix (B') to generate a first set of intermediate data M(i);

carrying out an N/2 term product operation of said second set of input data and a second coefficient matrix (B') to generate a second set of intermediate data N(i);

carrying out addition and subtraction of said first set of intermediate data M(i) and said second set of intermediate data N(i) to generate a first set of addition data and a second set of subtraction data; and

outputting said first set of said addition data and said second set of said subtraction data in a predetermined order, wherein

said step of generating said intermediate data M(i) and N(i) includes the step of generating a corresponding partial sum by reference to table memory means (43), using the applied data as an address, and summing up the partial sums for each respective intermediate data.

24. The method of claim 23, wherein

said addition data is a sum of data M (i) and data N (i), said addition data of M (i) + N (i) indicating ith-term data X (i) of N-term output data; and

said subtraction data is a difference between data M (i) and data N (i), said subtraction data of M (i) - N (i) providing (N - i - 1)th-term output data X (N - i - 1).
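The inverse method of claims 23 and 24 can be sketched in the same way: the even-term and odd-term inputs each pass through an N/2-point product to give M(i) and N(i), and a final butterfly yields X(i) = M(i) + N(i) and X(N - i - 1) = M(i) - N(i). The orthonormal scaling and the names `idct_coef`, `idct_direct` and `idct_split` are assumptions made for the example, and the direct multiplications again stand in for the table-based partial sums.

```python
import math


def idct_coef(k, j, n):
    """Weight of input term k in output term j (assumed orthonormal scaling)."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))


def idct_direct(y):
    """Reference N-point inverse transform as a full N-by-N product."""
    n = len(y)
    return [sum(idct_coef(k, j, n) * y[k] for k in range(n)) for j in range(n)]


def idct_split(y):
    """N-point inverse transform from two N/2-point products and a butterfly."""
    n = len(y)
    half = n // 2
    even_in = y[0::2]                      # first set: Y(2i)
    odd_in = y[1::2]                       # second set: Y(2i + 1)
    # Intermediate data M(i) from the even-term inputs
    m = [sum(idct_coef(2 * k, i, n) * even_in[k] for k in range(half))
         for i in range(half)]
    # Intermediate data N(i) from the odd-term inputs
    nn = [sum(idct_coef(2 * k + 1, i, n) * odd_in[k] for k in range(half))
          for i in range(half)]
    x = [0.0] * n
    for i in range(half):
        x[i] = m[i] + nn[i]                # addition data: X(i)
        x[n - 1 - i] = m[i] - nn[i]        # subtraction data: X(N - i - 1)
    return x
```

The butterfly works because the cosine weight of an even-numbered input term is the same at positions i and N - i - 1, while that of an odd-numbered term changes sign.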

# **Patentansprüche**

1. Prozessor mit mindestens einer Funktion des Ausführens einer eindimensionalen diskreten Cosinus-Transformation DCT und einer eindimensionalen inversen diskreten Cosinus-Transformation IDCT von Eingangsdaten X mit N Termen, worin das N eine positive ganze Zahl ist, wobei der Prozessor aufweist:

ein Vorverarbeitungsmittel (1; 1a, 1b) zum Ausführen von Addition und Subtraktion für jede von Mengen von vorbestimmten zwei Termen der Eingangsdaten X zum Erzeugen einer ersten Menge von Additionsdaten ($z_k$) und einer zweiten Menge von Subtraktionsdaten ($w_k$), die der DCT-Verarbeitung unterworfen werden; und

ein Matrixproduktmittel (2, 3; 2a, 3a, 2b, 3b) zum Erzielen eines ersten Matrixproduktes der ersten Menge von Daten ($z_k$) von dem Vorverarbeitungsmittel (1; 1a, 1b) und einer vorbestimmten ersten Koeffizientenmatrix (B) und eines zweiten Matrixproduktes der zweiten Menge von Daten ($w_k$) und einer vorbestimmten zweiten Koeffizientenmatrix (B),

worin ein Ausgang des Matrixproduktmittels Ausgangsdaten mit N Termen vorsieht, die der DCT-Verarbeitung unterworfen werden, das Matrixproduktmittel ein Tabellenspeichermittel (43) aufweist, das darin Teilproduktsummendaten speichert für entsprechende Ausgaben davon an Adressen, die durch Daten vorgesehen sind, die von dem Vorverarbeitungsmittel empfangen sind, und ein Akkumulationsmittel (42) zum Summieren von Daten, die von dem Tabellenspeichermittel (43) empfangen sind, wobei das Matrixproduktmittel N/2 Punktproduktsummenoperationen ausführt,

einen Nachverarbeitungsabschnitt (7; 7a, 7b) zum Empfangen einer Ausgabe des Matrixproduktmittels zum Ausführen von Addition und Subtraktion von vorbestimmten Daten mit zwei Termen der empfangenen Daten mit N Termen und zum Erzeugen einer ersten und einer zweiten Menge (X2i, X2i+1) von Ausgangsdaten Yi, wobei der Ausgang des Nachverarbeitungsmittels Ausgangsdaten mit N Termen vorsieht, die der IDCT-Verarbeitung unterworfen werden, und ein Steuermittel (8) zum selektiven Freigeben eines des Vorverarbeitungsabschnittes (1; 1a, 1b) und des Nachverarbeitungsabschnittes (7; 7a, 7b).

2. Prozessor nach Anspruch 1, bei dem das Vorverarbeitungsmittel (1) aufweist:

ein Mengenerzeugermittel (21) zum Erzeugen einer Menge von Daten X(p) des p-ten Termes und Daten (X(q)) des q-ten Termes der Eingangsdaten (X), wobei p+q = N-1, 0 ≤ p < q ≤ N-1 ist und p und q ganze Zahlen sind; ein Additionsmittel (22; 26) zum Ausführen der Addition der Daten mit zwei Termen, die von dem Mengenerzeugermittel (21) ausgegeben sind; und

ein Subtraktionsmittel (23, 26) zum Ausführen der Subtraktion der Daten mit zwei Termen, die von dem Mengenerzeugermittel (21) ausgegeben sind.

3. Prozessor nach Anspruch 1 oder 2, bei dem das Matrixproduktmittel (2, 3) aufweist

ein Speichermittel (32) zum sequentiellen Empfangen der ersten Mengen von Daten von dem Vorverarbeitungsmittel (1) zum Speichern der empfangenen Daten darin, wobei jede der ersten Menge von Daten eine Mehrzahl von Bit aufweist, und

ein Parallellesemittel (33) zum Lesen auf parallele Weise und in der Reihenfolge von 1Bit-Daten in der gleichen Bitanordnung aller der ersten Menge von Daten, die in dem Speichermittel (32) gespeichert sind.

4. Prozessor nach einem der Ansprüche 1 bis 3, bei dem das Matrixproduktmittel (2, 3) aufweist

ein Speichermittel (32) zum sequentiellen Empfangen der zweiten Menge von Daten von dem Vorverarbeitungsmittel (1) zum Speichern der empfangenen Daten darin, wobei jede der zweiten Menge von Daten eine Mehrzahl von Bit aufweist, und

ein Parallellesemittel (33) zum Lesen auf parallele Weise und in der Reihenfolge von 1Bit-Daten in der gleichen Bitanordnung aller der zweiten Menge von Daten, die in dem Speichermittel (32) gespeichert sind.

5. Prozessor nach Anspruch 3 oder 4, bei dem das Matrixproduktmittel (2, 3) weiter aufweist

eine Mehrzahl von ersten Produktsummenoperationsmitteln (6a-6d), und bei dem jedes der Produktsummenoperationsmittel das Tabellenspeichermittel (43) enthält, das parallele Bitdaten von dem Parallellesemittel (33) als ein Adressensignal zum Ausgeben einer entsprechenden Teilsumme empfängt, wobei das Tabellenspeichermittel (43) zuvor die Produktsumme eines entsprechenden Koeffizienten und der parallelen Bitdaten in Tabellenform speichert, und

ein Akkumulationsmittel (42), das die Ausgaben des Tabellenspeichermittels (43) akkumuliert, wobei das Akkumulationsmittel (42) eine erste Menge von Ausgangsdaten der Ausgangsdaten mit N Termen vorsieht.

6. Prozessor nach einem der Ansprüche 1 bis 5, bei dem das Matrixproduktmittel (2, 3) weiter aufweist

eine Mehrzahl von zweiten Produktsummenoperationsmitteln (6e-6h), und worin jedes zweite Produktsummenoperationsmittel das Tabellenspeichermittel (43) zum Empfangen paralleler Bitdaten von dem Parallellesemittel (33) als ein Adressensignal zum Ausgeben einer entsprechenden Teilsumme aufweist, wobei das Tabellenspeichermittel (43) zuvor eine Produktsumme eines entsprechenden Koeffizienten und der parallelen Bitdaten in einer Tabellenform speichert, und das Akkumulationsmittel (42) die Ausgaben des Tabellenspeichermittels (43) akkumuliert zum Erzeugen einer zweiten Menge von Daten der Ausgangsdaten mit N Termen.

7. Prozessor nach Anspruch 5 oder 6, bei dem das Akkumulationsmittel (42) aufweist

ein Additionsmittel (44) zum Empfangen einer Ausgabe des Tabellenspeichermittels (43) an seinem einen Eingang.

ein Registermittel (45) zum zeitweiligen Speichern einer Ausgabe des Additionsmittels (44) und ein Schiebemittel (46) zum Schieben von Speicherdaten in dem Registermittel (45) um ein Bit zum Anlegen der geschobenen Daten an den anderen Eingang des Additionsmittels (44), wobei eine letzte Ausgabe des Schiebemittels die erste oder zweite Menge von Daten der Ausgangsdaten mit N Termen vorsieht.

8. Prozessor nach einem der Ansprüche 1 bis 7,

bei dem der Nachbearbeitungsabschnitt (7) ein Mittel (22, 23; 26) zum Ausführen von Addition und Subtraktion der Daten Y(2i) des (2i)ten Termes und Daten Y(2i+1) des (2i+1)ten Termes der Ausgangsdaten Y mit N Termen des Matrixproduktmittels (2, 3) aufweist, worin das i eine ganze Zahl von  $0 \le i \le N/2 - 1$  ist.

9. Prozessor nach Anspruch 8,

bei dem die Addition der Daten Y(2i) und Y(2i+1) Ausgangsdaten Z(i) des i-ten Termes bezeichnet und die Subtraktion der Daten Y(2i) und Y(2i+1) die Ausgangsdaten Z(N-i-1) des (N-i-1)ten Termes bezeichnet.

10. Prozessor nach einem der Ansprüche 1 bis 9 weiter mit:

einem Transponierungsmittel (12) zum sequentiellen Empfangen von Ausgangsdaten (Rk) des Matrixproduktmittels (2, 3) zum Speichern der empfangenen Daten darin, Transponieren einer Matrix, die durch die gespeicherten Daten gebildet ist, und sequentiellen Ausgeben von Zwischendaten mit N Termen;

einem zweiten Vorverarbeitungsmittel (1b) mit dem gleichen Aufbau wie der des Vorverarbeitungsmittels (1a) zum Empfangen einer Ausgabe des Transponierungsmittels (12) zum Ausführen von Addition und Subtraktion für jede der vorbestimmten Mengen mit zwei Termen der Zwischendaten mit N Termen; und

einem zweiten Matrixproduktmittel (2b, 3b) mit dem gleichen Aufbau wie der des Matrixproduktmittels (2a, 3a) zum Ausführen einer Produktoperation der Ausgangsdaten des zweiten Vorverarbeitungsmittels (1b) und einer vorbestimmten zweiten Koeffizientenmatrix, wobei eine Ausgabe des zweiten Matrixproduktmittels (2b, 3b) Daten bezeichnet, die einer zweidimensionalen DCT-Verarbeitung unterworfen werden.

11. Prozessor nach Anspruch 10, weiter mit:

einem zweiten Nachverarbeitungsmittel (7b) mit dem gleichen Aufbau wie der des Nachbearbeitungsmittels (7a) zum Empfangen einer Ausgabe des zweiten Matrixproduktmittels (2b, 3b); und ein zweites Steuermittel (8) zum Freigeben von einem des zweiten Vorverarbeitungsmittels (1b) und des zweiten Nachverarbeitungsmittels (7b).

12. Prozessor mit mindestens einer Funktion des Ausführens einer eindimensionalen inversen diskreten Cosinus-Transformation (IDCT) von Eingangsdaten Y mit N Termen, worin das N eine positive ganze Zahl ist, wobei der Prozessor aufweist:

ein Matrixproduktmittel (2, 3; 2a, 3a, 2b, 3b) zum Unterteilen der Eingangsdaten Y mit N Termen in eine erste Menge von Eingangsdaten und eine zweite Menge von Eingangsdaten und zum Ausführen einer N/2 Punktproduktoperation der ersten Menge von Eingangsdaten und einer ersten Koeffizientenmatrix (B') und einer N/2 Punktproduktoperation der zweiten Menge von Eingangsdaten und einer zweiten Koeffizientenmatrix (B') zum Erzeugen einer ersten Menge von Zwischendaten Mi und einer zweiten Menge von Zwischendaten Ni, worin das i eine ganze Zahl von 0 ≤ i ≤ N/2-1 ist,

wobei das Matrixproduktmittel ein Tabellenspeichermittel (43), das Teilproduktsummendaten für entsprechende Zwischendaten an Adressen speichert, die durch die Eingangsdaten vorgesehen wird, und ein Akkumulationsmittel (42) zum Summieren von Daten, die von dem Tabellenspeichermittel (43) empfangen sind, aufweist; und

ein Nachverarbeitungsmittel (7, 7a, 7b) zum Ausführen von Addition und Subtraktion der zwei Zwischendaten in einer vorbestimmten Beziehung in der ersten Menge von Zwischendaten Mi und der zweiten Menge von Zwischendaten Ni von dem Matrixproduktmittel zum Erzeugen einer ersten und einer zweiten Menge von Ausgangsdaten Xi.

13. Prozessor nach Anspruch 12,

bei dem das Nachverarbeitungsmittel (7, 7a, 7b) ein Mittel (22, 23; 26) zum Ausführen von Addition und Subtraktion der ersten Menge von Zwischendaten Mi des (i)ten Termes und der zweiten Menge von Zwischendaten Ni des (i)ten Termes aufweist; und Additionsdaten (Mi+Ni) Daten des (i)ten Termes der Ausgangsdaten mit N Termen bezeichnen und Subtraktionsdaten (Mi-Ni) Daten des (N-i-1)ten Termes der Ausgangsdaten mit N Termen bezeichnen.

14. Prozessor nach Anspruch 12 oder 13,

bei dem jede der Zwischendaten durch eine Mehrzahl von Bit dargestellt werden und das Matrixproduktmittel (2, 3) aufweist

ein Speichermittel (32) zum Unterteilen der Eingangsdaten Y mit N Termen in eine erste Menge von Eingangsdaten Y(2i) und eine zweite Menge von Eingangsdaten Y(2i+1) zum Speichern jeder der Menge der Eingangsdaten darin.

ein erstes Lesemittel (33) zum Lesen auf parallele Weise von 1Bit-Daten in der gleichen Reihenfolge die erste Menge aller Eingangsdaten Y(2i) aus dem Speichermittel (32),

ein zweites Lesemittel (33) zum Lesen auf parallele Weise von 1Bit-Daten in der gleichen Bitanordnung der zweiten Menge von allen Eingangsdaten Y(2i+1) aus dem Speichermittel (32),

ein erstes Produktsummenoperationsmittel (6a-6d) zum Ausführen einer Produktsummenoperation von parallelen Bitdaten von dem ersten Lesemittel (33) und einem entsprechenden Koeffizienten der ersten Koeffizientenmatrix zum Erzeugen der ersten Menge von Ausgangsdaten (Xi), und

ein zweites Produktsummenoperationsmittel (6e-6h) zum Ausführen einer Produktsummenoperation paralleler Bitdaten von dem zweiten Lesemittel (33) und einem entsprechenden Koeffizienten der zweiten Koeffizientenmatrix zum Erzeugen der zweiten Menge von Ausgangsdaten (Xi).

15. Prozessor nach Anspruch 14,

bei dem das erste und das zweite Produktsummenoperationsmittel eine Mehrzahl von Operationsschaltungen enthalten, von denen sich jede auf einen Term der Ausgangsdaten Xi bezieht, wobei jede Operationsschaltung (6a-6h) aufweist

das Tabellenspeichermittel (43), das parallel Bitdaten als ein Adressensignal empfängt zum Ausgeben des Resultates der Produktsummenoperation mit dem entsprechenden Koeffizienten,

wobei das Tabellenspeichermittel (43) zuvor Daten, die das Resultat der Produktsummenoperation bezeichnen, in einer Tabellenform speichert, und

das Akkumulationsmittel (42) die Ausgaben des Tabellenspeichermittels (43) akkumuliert.

16. Prozessor nach Anspruch 15, bei dem das Akkumulationsmittel (42) aufweist

ein Zweieingangsadditionsmittel (44) zum Empfangen einer Ausgabe des Tabellenspeichermittels (43) an seinem einen Eingang,

ein Registermittel (45) zum zeitweiligen Speichern einer Ausgabe des Additionsmittels und ein Schiebemittel (46) zum Schieben von Speicherdaten in dem Registermittel (45) um 1 Bit zum Anlegen der verschobenen Daten an den anderen Eingang des Additionsmittels (44), wobei eine letzte Ausgabe des Schiebemittels (46) Ausgangsdaten eines zugehörigen Termes bezeichnet.

17. Prozessor nach einem der Ansprüche 12 bis 16 weiter mit:

einem Vorverarbeitungsmittel (1; 1a, 1b) zum Ausführen von Addition und Subtraktion einer vorbestimmten Menge von Daten Y(j) Y(N-j-1) mit zwei Termen und der Eingangsdaten Y mit N Termen zum Erzeugen einer ersten Menge von Additionsdaten und einer zweiten Menge von Subtraktionsdaten, wobei die erste Menge von Additionsdaten und die zweite Menge von Subtraktionsdaten als die erste und die zweite Menge von Eingangsdaten an das Matrixproduktmittel (2, 3) angelegt werden; und ein Steuermittel (8) zum Freigeben von einem des Vorverarbeitungsmittels (1) und des Nachbearbeitungsmittels (7).

18. Prozessor nach einem der Ansprüche 12 bis 17 weiter mit:

einem Transponierungsmittel (12) zum sequentiellen Empfangen von N Ausgangsdaten mit N Termen von dem Nachverarbeitungsmittel (7a) zum Speichern der empfangenen Daten darin, dann Transponieren der gespeicherten Daten und Ausgeben der transponierten Daten;

einem zweiten Matrixproduktmittel (2b, 3b) des gleichen Aufbaus wie der des Matrixproduktmittels (2a, 3a) zum Empfangen einer Ausgabe des Transponierungsmittels (12); und

einem zweiten Nachverarbeitungsmittel (7b) des gleichen Aufbaus wie der des Nachverarbeitungsmittels (7a) zum Empfangen einer Ausgabe des zweiten Matrixproduktmittels (2b, 3b),

wobei eine Ausgabe des zweiten Nachbearbeitungsmittels (7b) Daten bezeichnet, die einer zweidimensionalen IDCT-Verarbeitung von N x N Punkten unterworfen werden.

19. Prozessor nach einem der Ansprüche 1 bis 18, bei dem N = 8 ist.

20. Prozessor nach einem der Ansprüche 1 bis 19,

bei dem der Prozessor (51) integrierend in eine integrierte Schaltung (50) so eingesetzt ist, daß er in Zusammenwirkung mit anderen funktionalen Schaltungen (52, 53, 54; 55, 56) tätig ist.

- 21. Verfahren zum Verarbeiten einer eindimensionalen diskreten Cosinus-Transformation DCT oder einer eindimensionalen inversen diskreten Cosinus-Transformation IDCT von N Punkten X, worin N = 2<sup>m</sup> ist, das m eine natürliche Zahl ist, wobei das Verfahren die Schritte aufweist:
- Erster Schritt des Empfangens von Eingangsdaten X von N Punkten zum Unterteilen der empfangenen Eingangsdaten X in eine erste Menge von N/2 Daten und eine zweite Menge von N/2 Daten, wobei jede Menge Daten in einer vorbestimmten Beziehung in den Eingangsdaten X von N Punkten aufweist;
  - Zweiter Schritt des Ausführens von Addition und Subtraktion von jeden Daten mit zwei Termen in einer vorbestimmten Beziehung in der ersten Menge und in der zweiten Menge in den Eingangsdaten X zum Erzeugen einer ersten Menge von Additionsdaten $(z_k)$ und einer zweiten Menge von Subtraktionsdaten $(w_k)$, wobei die erste und die zweite Menge Daten mit N/2 Termen enthalten, die der DCT-Verarbeitung unterworfen werden;
  - Dritter Schritt des Ausführens einer N/2 Punktproduktoperation der ersten Menge von Additionsdaten $(z_k)$ und einer ersten Koeffizientenmatrix (B) zum Erzeugen einer ersten Menge von Ausgangsdaten;
  - Vierter Schritt des Ausführens einer N/2 Punktproduktoperation der zweiten Menge von Subtraktionsdaten (w<sub>k</sub>) und einer zweiten Koeffizientenmatrix (B) zum Erzeugen einer zweiten Menge von Ausgangsdaten, wobei der Schritt des Erzeugens der ersten und der zweiten Menge von Ausgangsdaten die Schritte des Erzeugens einer entsprechenden Teilsumme durch Bezugnahme auf ein Tabellenspeichermittel (43) enthält, wobei angelegte Daten als ein Adreßsignal benutzt werden, und Summieren der Teilsummen;
  - Fünfter Schritt des Ausgebens der ersten Menge von Ausgangsdaten und der zweiten Menge von Ausgangsdaten, die der DCT-Verarbeitung unterworfen werden, in einer vorbestimmten Reihenfolge;
  - Sechster Schritt des Ausführens einer N/2 Punktproduktoperation der ersten Menge von Eingangsdaten und einer dritten Koeffizientenmatrix (B') zum Erzeugen einer ersten Menge von Zwischendaten M(i);
  - Siebter Schritt des Ausführens einer N/2 Punktproduktoperation der zweiten Menge von Eingangsdaten und einer vierten Koeffizientenmatrix (B') zum Erzeugen einer zweiten Menge von Zwischendaten N(i);
  - Achter Schritt des Ausführens von Addition und Subtraktion der ersten Menge von Zwischendaten M(i) und der zweiten Menge von Zwischendaten N(i) zum Erzeugen einer Menge von Additionsdaten und einer Menge von Subtraktionsdaten; und
  - Neunter Schritt des Ausgebens der Menge der Additionsdaten und der Menge der Subtraktionsdaten, die der IDCT-Verarbeitung unterworfen werden, in einer vorbestimmten Reihenfolge,
  - wobei der Schritt des Erzeugens der Zwischendaten M(i) und N(i) den Schritt des Erzeugens einer entsprechenden Teilsumme unter Bezugnahme auf ein Tabellenspeichermittel (43) enthält, wobei die angelegten Daten als eine Adresse benutzt werden, und Summieren der Teilsummen für jede entsprechenden Daten, und selektives Freigeben eines der Betriebsverarbeitungen des zweiten bis fünften Schrittes und des sechsten bis neunten Schrittes.
- 22. Verfahren nach Anspruch 21,
  - bei dem die Daten mit zwei Termen in der vorbestimmten Beziehung Daten x(i) des (i)ten Termes und Daten x(N-i-1) des (N-i-1)ten Termes sind, wobei das i eine ganze Zahl von $0 \le i \le N/2-1$ ist.
- 23. Verfahren zum Ausführen einer eindimensionalen inversen diskreten Cosinus-Transformation (IDCT) von N Punkten, wobei das N = 2<sup>m</sup> ist, das m eine natürliche Zahl ist, wobei das Verfahren die Schritte aufweist:
  - Empfangen von Eingangsdaten Y mit N Termen zum Erzeugen einer ersten Menge von Eingangsdaten von Daten Y(2i) mit geraden Termen und einer zweiten Menge von Eingangsdaten von Daten Y(2i+1) von ungeraden Termen, worin das i eine ganze Zahl von $0 \le i \le N/2-1$ ist;
  - Ausführen einer N/2 Termproduktoperation der ersten Menge von Eingangsdaten und einer ersten Koeffizientenmatrix (B') zum Erzeugen einer ersten Menge von Zwischendaten M(i);
  - Ausführen einer N/2 Termproduktoperation der zweiten Menge von Eingangsdaten und einer zweiten Koeffizientenmatrix (B') zum Erzeugen einer zweiten Menge von Zwischendaten N(i);
  - Ausführen von Addition und Subtraktion der ersten Menge von Zwischendaten M(i) und der zweiten Menge von Zwischendaten N(i) zum Erzeugen einer ersten Menge von Additionsdaten und einer zweiten Menge von Subtraktionsdaten; und
  - Ausgeben der ersten Menge der Additionsdaten und der zweiten Menge der Subtraktionsdaten in einer vorbestimmten Reihenfolge, worin der Schritt des Erzeugens der Zwischendaten M(i) und N(i) den Schritt des Erzeugens einer entsprechenden Teilsumme durch Bezugnahme auf ein Tabellenspeichermittel (43) enthält, wobei die angelegten Daten als eine Adresse benutzt werden, und Summieren der Teilsummen für jede entsprechenden Zwischendaten.

24. Verfahren nach Anspruch 23,

worin die Additionsdaten eine Summe von Daten M(i) und Daten N(i) sind, wobei die Additionsdaten von M(i) + N(i) die Daten X(i) des i-ten Termes von Ausgangsdaten mit N Termen bezeichnen; und die Subtraktionsdaten eine Differenz zwischen Daten M(i) und Daten N(i) sind, wobei die Subtraktionsdaten M(i) - N(i) Ausgangsdaten X(N-i-1) des (N-i-1)ten Termes vorsehen.

# **Revendications**

- 1. Processeur ayant au moins une fonction de réalisation d'une transformée unidimensionnelle discrète de cosinus DCT et une transformée unidimensionnelle inverse discrète de cosinus IDCT de données d'entrée de N termes X, où ledit N est un entier positif, ledit processeur comprenant :
  - un moyen de pré-traitement (1 ; 1a, 1b) pour réaliser l'addition et la soustraction pour chacun des jeux des deux termes prédéterminés desdites données d'entrée X pour générer un premier jeu des données d'addition (z<sub>k</sub>) et un second jeu des données de soustraction (w<sub>k</sub>) soumis au traitement DCT ; et
  - un moyen de produit de matrice (2, 3 ; 2a, 3a ; 2b, 3b) pour obtenir un premier produit de matrice dudit premier jeu des données (z<sub>k</sub>) à partir dudit moyen de pré-traitement (1 ; 1a, 1b) et une première matrice de coefficient prédéterminé (B), et un second produit de matrice dudit second jeu des données (w<sub>k</sub>) et une seconde matrice de coefficient prédéterminé (B), où
  - une sortie dudit moyen de produit de matrice fournit des données de sortie de N termes soumises à un traitement DCT,
  - ledit moyen de produit de matrice comprenant un moyen de mémoire de table (43) stockant des données de somme partielle de produits pour des sorties respectives de celui-ci à des adresses fournies par des données reçues par ledit moyen de pré-traitement, et
  - un moyen d'accumulation (42) pour additionner des données reçues dudit moyen de mémoire de table (43), ledit moyen de produit de matrice réalisant des opérations de somme de produits de N/2 points,
  - une section de post-traitement (7 ; 7a, 7b) pour recevoir une sortie dudit moyen de produit de matrice pour réaliser l'addition et la soustraction des données de deux termes prédéterminés des données de N termes reçues et générer des premier et second jeux (x2i, x2i + 1) des données de sortie Yi, où la sortie dudit moyen de post-traitement fournit des données de sortie de N termes soumises au traitement IDCT, et
  - un moyen de commande (8) pour valider sélectivement une de ladite section de pré-traitement (1 ; 1a, 1b) et de ladite section de post-traitement (7 ; 7a, 7b).
- 2. Processeur selon la revendication 1, où
  - ledit moyen de pré-traitement (1) comprend :

- un moyen de génération de jeu (21) pour générer un jeu de données de  $p^{ième}$  terme X(p) et des données de  $q^{ième}$  terme X(q) desdites données d'entrée X, où p + q = N-1,  $0 \le p < q \le N-1$ , et p et q sont des entiers ; un moyen d'addition (22 ; 26) pour réaliser l'addition des données de deux termes fournies par ledit moyen de génération de jeu (21) ; et
- un moyen de soustraction (23, 26) pour réaliser la soustraction des données de deux termes fournies par ledit moyen de génération de jeu (21).
- 3. Processeur selon la revendication 1 ou 2, dans lequel ledit moyen de produit de matrice (2, 3) comprend
  - un moyen de stockage (32) pour recevoir séquentiellement ledit premier jeu de données à partir dudit moyen de pré-traitement (1) pour y stocker les données reçues, chacun dudit premier jeu de données ayant une pluralité de bits ; et
- un moyen de lecture parallèle (33) pour lire, en parallèle et dans l'ordre, des données de un bit dans la même figure de bit de tout ledit premier jeu de données stocké dans ledit moyen de stockage (32).
  - 4. Processeur selon l'une des revendications 1 à 3, dans lequel
- ledit moyen de produit de matrice (2, 3) comprend un moyen de stockage (32) pour recevoir séquentiellement ledit second jeu de données à partir dudit moyen de pré-traitement (1) pour y stocker les données reçues, chacune dudit second jeu de données ayant une pluralité de bits ; et

un moyen de lecture parallèle (33) pour lire en parallèle et dans l'ordre, des données de un bit sur la même figure de bit de tout ledit second jeu de données stocké dans ledit moyen de stockage (32).

5. Processeur selon la revendication 3 ou 4, dans lequel

ledit moyen de produit de matrice (2, 3) comprend en outre une pluralité de premiers moyens d'opération de somme de produits (6a à 6d), et dans lequel chaque dit moyen d'opération de somme de produit comprend ledit moyen de mémoire de table (43) recevant des données de bit parallèle à partir dudit moyen de lecture parallèle (33) comme un signal d'adresse pour fournir une somme partielle correspondante, ledit moyen de mémoire de table (43) stockant à l'avance la somme de produit d'un coefficient correspondant et lesdites données de bit parallèle sous la forme d'une table, et ledit moyen d'accumulation (42) accumulant des sorties dudit moyen de mémoire de table (43), ledit moyen d'accumulation (42) fournissant un premier jeu de données de sortie desdites données de sortie de N termes.

6. Processeur selon l'une quelconque des revendications 1 à 5, dans lequel

ledit moyen de produit de matrice (2, 3) comprend en outre : une pluralité de seconds moyens d'opération de somme de produits (6e à 6h), et dans lequel chacun dudit second moyen d'opération de somme de produits comprend :

ledit moyen de mémoire de table (43) pour recevoir des données de bit parallèle dudit moyen de lecture parallèle (33) comme un signal d'adresse pour fournir une somme partielle correspondante, ledit moyen de mémoire de table (43) stockant à l'avance une somme de produits d'un coefficient correspondant et lesdites données de bit parallèle sous la forme d'une table, et

ledit moyen d'accumulation (42) accumulant des sorties dudit moyen de mémoire de table (43) pour générer un second jeu de données desdites données de sortie de N termes.

7. Processeur selon l'une quelconque des revendications 5 ou 6, dans lequel

ledit moyen d'accumulation (42) comprend un moyen d'addition (44) pour recevoir une sortie dudit moyen de mémoire de table (43) sur son entrée, un moyen de registre (45) pour stocker temporairement une sortie dudit moyen d'addition (44), et un moyen de décalage (46) pour décaler les données de stockage dans ledit moyen de registre (45) par un bit pour appliquer les données décalées à l'autre entrée dudit moyen d'addition (44), une sortie finale dudit moyen de décalage fournissant ledit premier ou second jeu de données desdites données de sortie de N termes.

8. Processeur selon l'une quelconque des revendications 1 à 7, dans lequel :

ladite section de post-traitement (7) comprend un moyen (22, 23 ; 26) pour réaliser l'addition et la soustraction des données de (2i)<sup>ième</sup> terme Y(2i) et des données de (2i + 1)<sup>ième</sup> terme Y(2i + 1) des données de sortie de N termes Y dudit moyen de produit de matrice (2, 3), dans lequel ledit i est un entier de $0 \le i \le N/2 - 1$.

9. Processeur selon la revendication 8, dans lequel

l'addition desdites données Y(2i) et Y(2i + 1) indique les données de sortie du (i)<sup>ième</sup> terme Z(i), et la soustraction desdites données Y(2i) et Y(2i + 1) indique les données de sortie du (N - i - 1)<sup>ième</sup> terme Z(N - i - 1).

10. Processeur selon l'une quelconque des revendications 1 à 9, comprenant en outre :

un moyen de transposition (12) pour recevoir séquentiellement des données de sortie (Rk) dudit moyen de produit de matrice (2, 3) pour y stocker les données reçues, transposer une matrice formée par les données stockées et fournir séquentiellement des données intermédiaires de N termes; un second moyen de pré-traitement (1b) ayant la même configuration que celle dudit moyen de prétraitement (1a), pour recevoir une sortie dudit moyen de transposition (12) pour réaliser l'addition et la soustraction pour chacun des jeux de deux termes prédéterminés desdites données intermédiaires de N termes; et un second moyen de produit de matrice (2b, 3b) ayant la même configuration que celle dudit moyen de produit de matrice (2a, 3a), pour réaliser une opération de produit de données de sortie dudit second moyen de prétraitement (1b) et une seconde matrice de coefficient prédéterminé, une sortie dudit second moyen de produit de matrice (2b, 3b) indiquant des données soumises à un traitement DCT bi-dimensionnel.

11. The processor according to claim 10, further comprising:

second post-processing means (7b), having the same configuration as said post-processing means (7a), for receiving an output of said second matrix product means (2b, 3b); and

second control means (8) for enabling one of said second pre-processing means (1b) and said second post-processing means (7b).

12. A processor having at least a function of performing a one-dimensional inverse discrete cosine transform (IDCT) on N-term input data Y, wherein said number N is a positive integer, said processor comprising:

matrix product means (2, 3; 2a, 3a, 2b, 3b) for dividing said N-term input data Y into a first set of input data and a second set of input data, and performing an N/2-point product operation of said first set of input data and a first coefficient matrix (B') and an N/2-point product operation of said second set of input data and a second coefficient matrix (B'), to generate a first set of intermediate data Mi and a second set of intermediate data Ni, wherein said i is an integer satisfying $0 \le i \le N/2 - 1$,

said matrix product means comprising table memory means (43) storing partial sum-of-products data for the respective intermediate data at addresses supplied by the input data, and accumulating means (42) for adding the data received from said table memory means (43); and

post-processing means (7; 7a, 7b) for performing addition and subtraction of two intermediate data in a predetermined relationship in said first set of intermediate data Mi and said second set of intermediate data Ni from said matrix product means, to generate first and second sets of output data Xi.

13. The processor according to claim 12, wherein

said post-processing means (7; 7a, 7b) comprises means (22, 23; 26) for performing addition and subtraction of the (i)<sup>th</sup> term Mi of said first set of intermediate data and the (i)<sup>th</sup> term Ni of said second set of intermediate data; and

the addition data (Mi + Ni) indicate the (i)<sup>th</sup> term data of the N-term output data, and the subtraction data (Mi - Ni) indicate the (N - i - 1)<sup>th</sup> term data of said N-term output data.
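The split into two N/2-point products followed by an add/subtract stage can be checked numerically. The following Python sketch assumes the conventional orthonormal IDCT (inverse of DCT-II) and assumes that the first set is the even-indexed inputs and the second set the odd-indexed inputs. The names `idct_direct`, `idct_split` and `_scale` are hypothetical, and the cosine factors are computed on the fly rather than read from coefficient matrices.

```python
import math
from typing import List, Sequence


def _scale(k: int, n: int) -> float:
    """Assumed orthonormal scaling factor for coefficient index k."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def idct_direct(y: Sequence[float]) -> List[float]:
    """Reference N-point IDCT evaluated as a full N x N product."""
    n = len(y)
    return [
        sum(_scale(k, n) * y[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for k in range(n))
        for i in range(n)
    ]


def idct_split(y: Sequence[float]) -> List[float]:
    """IDCT via two N/2-point products and an add/subtract stage."""
    n = len(y)
    half = n // 2
    x = [0.0] * n
    for i in range(half):
        # M(i): product of the even-indexed inputs with their coefficients.
        m_i = sum(_scale(2 * j, n) * y[2 * j]
                  * math.cos((2 * i + 1) * (2 * j) * math.pi / (2 * n))
                  for j in range(half))
        # N(i): product of the odd-indexed inputs with their coefficients.
        n_i = sum(_scale(2 * j + 1, n) * y[2 * j + 1]
                  * math.cos((2 * i + 1) * (2 * j + 1) * math.pi / (2 * n))
                  for j in range(half))
        x[i] = m_i + n_i          # (i)th output term
        x[n - i - 1] = m_i - n_i  # (N - i - 1)th output term
    return x
```

Each output pair shares the same M(i) and N(i), so the split form needs two products of size N/2 x N/2 instead of one of size N x N.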

14. The processor according to claim 12 or 13, wherein

each of said intermediate data is represented by a plurality of bits, and said matrix product means (2, 3) comprises

storage means (32) for dividing said N-term input data Y into a first set of input data Y(2i) and a second set of input data Y(2i + 1) and storing each set of input data therein,

first reading means (33) for reading out in parallel, from said storage means (32), one-bit data of the same bit position from all the input data Y(2i) of said first set,

second reading means (33) for reading out in parallel, from said storage means (32), one-bit data of the same bit position from all the input data Y(2i + 1) of said second set,

first sum-of-products operation means (6a to 6d) for performing a sum-of-products operation of the parallel bit data from said first reading means (33) and a corresponding coefficient of said first coefficient matrix, to generate said first set of output data Xi, and

second sum-of-products operation means (6e to 6h) for performing a sum-of-products operation of the parallel bit data from said second reading means (33) and a corresponding coefficient of said second coefficient matrix, to generate said second set of output data Xi.

15. The processor according to claim 14, wherein

said first and second sum-of-products operation means comprise a plurality of operation circuits each associated with one term of said output data Xi, each of said operation circuits (6a to 6h) comprising

said table memory means (43) receiving said parallel bit data as an address signal to supply the result of said sum-of-products operation with the corresponding coefficient,

said table memory means (43) storing in advance, in the form of a table, data indicating the result of the sum-of-products operation, and

said accumulating means (42) accumulating outputs of said table memory means (43).
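The table lookup recited above, in which one bit of the same position from each input forms an address and the stored entry is the precomputed partial sum, can be sketched as follows. The names `build_partial_sum_table` and `bit_slice_address`, the choice of input j driving address bit j, and the restriction to non-negative integer inputs in the usage example are assumptions for illustration only.

```python
from typing import List, Sequence


def build_partial_sum_table(coeffs: Sequence[int]) -> List[int]:
    """Precompute, for every address, the sum of the coefficients whose
    address bit is 1 (a model of what the table memory would hold)."""
    size = 1 << len(coeffs)
    table = []
    for addr in range(size):
        total = 0
        for j, c in enumerate(coeffs):
            if (addr >> j) & 1:
                total += c
        table.append(total)
    return table


def bit_slice_address(inputs: Sequence[int], bit: int) -> int:
    """Gather bit number `bit` of every input into one address word,
    with input j driving address bit j (an assumed wiring)."""
    addr = 0
    for j, value in enumerate(inputs):
        addr |= ((value >> bit) & 1) << j
    return addr


def lookup_dot_unsigned(inputs: Sequence[int], coeffs: Sequence[int], width: int) -> int:
    """Sum of products for non-negative inputs using one lookup per bit."""
    table = build_partial_sum_table(coeffs)
    return sum(table[bit_slice_address(inputs, b)] << b for b in range(width))
```

With four coefficients the table has 16 entries, and a 3-bit input word needs three lookups instead of four multiplications. For instance, `lookup_dot_unsigned([5, 3, 0, 6], [7, -3, 2, 5], 3)` gives `56`, matching 5·7 + 3·(-3) + 0·2 + 6·5.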

16. The processor according to claim 15, wherein

said accumulating means (42) comprises

two-input adding means (44) for receiving an output of said table memory means (43) at one input thereof,

register means (45) for temporarily storing an output of said adding means, and

shifting means (46) for shifting the data stored in said register means (45) by one bit and applying the shifted data to the other input of said adding means (44), a final output of said shifting means (46) indicating output data of an associated term.

17. The processor according to any one of claims 12 to 16, further comprising:

pre-processing means (1; 1a, 1b) for performing addition and subtraction of a predetermined set of two-term data Y(j), Y(N - j - 1) of said N-term input data Y to generate a first set of addition data and a second set of subtraction data, said first set of addition data and said second set of subtraction data being applied as said first and second sets of input data to said matrix product means (2, 3); and

control means (8) for enabling one of said pre-processing means (1) and said post-processing means (7).

18. The processor according to any one of claims 12 to 17, further comprising:

transposition means (12) for sequentially receiving N pieces of N-term output data from said post-processing means (7a), storing said received data therein, then transposing the stored data and supplying the transposed data;

second matrix product means (2b, 3b), having the same configuration as said matrix product means (2a, 3a), for receiving an output of said transposition means (12); and

second post-processing means (7b), having the same configuration as said post-processing means (7a), for receiving an output of said second matrix product means (2b, 3b),

an output of said second post-processing means (7b) indicating data subjected to two-dimensional IDCT processing of N by N points.

19. The processor according to any one of claims 1 to 18, wherein said number N is 8.

20. The processor according to any one of claims 1 to 19, wherein said processor (51) is integrally incorporated in an integrated circuit (50) so as to operate in cooperation with other functional circuits (52, 53, 54; 55, 56).

21. A method of performing one-dimensional discrete cosine transform (DCT) or one-dimensional inverse discrete cosine transform (IDCT) processing on N-point data X, wherein said number N is 2<sup>m</sup>, said m being a natural number, said method comprising:

a first step of receiving N-point input data X and dividing the received input data X into a first set of N/2 data and a second set of N/2 data, each set comprising data in a predetermined relationship in said N-point input data X;

a second step of performing addition and subtraction of each two-term data in a predetermined relationship for a first set and a second set in said input data X, to generate a first set of addition data $(z_k)$ and a second set of subtraction data $(w_k)$, said first and second sets comprising N/2-term data subjected to DCT processing;

a third step of performing an N/2-point product operation of said first set of addition data $(z_k)$ and a first coefficient matrix (B) to generate a first set of output data;

a fourth step of performing an N/2-point product operation of said second set of subtraction data $(w_k)$ and a second coefficient matrix (B) to generate a second set of output data, said step of generating said first and second sets of output data comprising the step of generating a corresponding partial sum by referring to table memory means (43), using applied data as an address signal, and adding the partial sums;

a fifth step of supplying said first set of output data and said second set of output data, subjected to DCT processing, in a predetermined order;

a sixth step of performing an N/2-point product operation of said first set of input data and a third coefficient matrix (B') to generate a first set of intermediate data M(i);

a seventh step of performing an N/2-point product operation of said second set of input data and a fourth coefficient matrix (B') to generate a second set of intermediate data N(i);

an eighth step of performing addition and subtraction of said first set of intermediate data M(i) and said second set of intermediate data N(i) to generate a second set of addition data and a set of subtraction data; and

a ninth step of supplying said set of addition data and said set of subtraction data, subjected to IDCT processing, in a predetermined order,

said step of generating said intermediate data M(i) and N(i) comprising the step of generating a corresponding partial sum by referring to table memory means (43), using the applied data as an address, and adding the partial sums for each of the respective intermediate data, and

selectively enabling either the processing of the second through fifth steps or the processing of the sixth through ninth steps.

22. The method according to claim 21, wherein

said two-term data in said predetermined relationship are the (i)<sup>th</sup> term data x(i) and the (N - i - 1)<sup>th</sup> term data x(N - i - 1), wherein said i is an integer satisfying $0 \le i \le N/2 - 1$.
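The second through fifth steps of the method (pairing x(i) with x(N - i - 1), forming sums and differences, then applying two N/2-point products) can be illustrated for the forward transform as follows. This is a sketch under assumptions: an orthonormal DCT-II scaling, even-indexed outputs produced from the sums and odd-indexed outputs from the differences, and hypothetical function names `dct_direct` and `dct_split`. The table-memory evaluation of the products is not modelled here.

```python
import math
from typing import List, Sequence


def _scale(k: int, n: int) -> float:
    """Assumed orthonormal scaling factor for output index k."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct_direct(x: Sequence[float]) -> List[float]:
    """Reference N-point DCT-II evaluated as a full N x N product."""
    n = len(x)
    return [
        _scale(k, n) * sum(x[j] * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                           for j in range(n))
        for k in range(n)
    ]


def dct_split(x: Sequence[float]) -> List[float]:
    """DCT via add/subtract pre-processing and two N/2-point products."""
    n = len(x)
    half = n // 2
    # Pre-processing: sums z_k and differences w_k of mirrored input pairs.
    z = [x[k] + x[n - k - 1] for k in range(half)]
    w = [x[k] - x[n - k - 1] for k in range(half)]
    y = [0.0] * n
    for m in range(half):
        even, odd = 2 * m, 2 * m + 1
        # First product: sums feed the even-indexed outputs.
        y[even] = _scale(even, n) * sum(
            z[j] * math.cos((2 * j + 1) * even * math.pi / (2 * n))
            for j in range(half))
        # Second product: differences feed the odd-indexed outputs.
        y[odd] = _scale(odd, n) * sum(
            w[j] * math.cos((2 * j + 1) * odd * math.pi / (2 * n))
            for j in range(half))
    return y
```

The split works because the cosine basis for even output indices is symmetric about the centre of the input block and the basis for odd output indices is antisymmetric, so each half-size product sees only the sums or only the differences.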

23. A method of performing an N-point one-dimensional inverse discrete cosine transform (IDCT), wherein N is 2<sup>m</sup>, said m being a natural number, said method comprising the steps of:

receiving N-term input data Y to generate a first set of input data of even-term data Y(2i) and a second set of input data of odd-term data Y(2i+1), wherein said i is an integer satisfying $0 \le i \le N/2 - 1$;

performing an N/2-term product operation of said first set of input data and a first coefficient matrix (B') to generate a first set of intermediate data M(i);

performing an N/2-term product operation of said second set of input data and a second coefficient matrix (B') to generate a second set of intermediate data N(i);

performing addition and subtraction of said first set of intermediate data M(i) and said second set of intermediate data N(i) to generate a first set of addition data and a second set of subtraction data; and

supplying said first set of addition data and said second set of subtraction data in a predetermined order, wherein

said step of generating said intermediate data M(i) and N(i) comprises the steps of generating a corresponding partial sum by referring to table memory means (43), using the applied data as an address, and adding the partial sums for each of the respective intermediate data.

24. The method according to claim 23, wherein

said addition data are a sum of data M(i) and data N(i), said addition data M(i) + N(i) indicating the (i)<sup>th</sup> term data X(i) of the N-term output data; and said subtraction data are a difference between data M(i) and data N(i), said subtraction data M(i) - N(i) providing the (N - i - 1)<sup>th</sup> term output data X(N - i - 1).
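The relations X(i) = M(i) + N(i) and X(N - i - 1) = M(i) - N(i) follow from a symmetry of the cosine kernel. The short derivation below is a sketch under an assumed conventional IDCT normalization; the symbol $a(k)$ for the scaling factor is introduced here for illustration and would be absorbed into the coefficient matrices (B').

$$
X(i) = \sum_{k=0}^{N-1} a(k)\, Y(k) \cos\frac{(2i+1)k\pi}{2N}
     = \underbrace{\sum_{j=0}^{N/2-1} a(2j)\, Y(2j) \cos\frac{(2i+1)(2j)\pi}{2N}}_{M(i)}
     + \underbrace{\sum_{j=0}^{N/2-1} a(2j+1)\, Y(2j+1) \cos\frac{(2i+1)(2j+1)\pi}{2N}}_{N(i)}
$$

Replacing $i$ by $N - i - 1$ changes the argument of the cosine to $k\pi - \frac{(2i+1)k\pi}{2N}$, so that

$$
\cos\frac{\bigl(2(N-i-1)+1\bigr)k\pi}{2N} = (-1)^{k} \cos\frac{(2i+1)k\pi}{2N}.
$$

The even-indexed terms ($k = 2j$) therefore keep their sign and the odd-indexed terms ($k = 2j+1$) change sign, which gives

$$
X(i) = M(i) + N(i), \qquad X(N-i-1) = M(i) - N(i), \qquad 0 \le i \le N/2 - 1 .
$$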



FIG. 2

FIG. 3

FIG. 4

FIG. 7A

FIG. 7B

FIG. 8

FIG. 9

FIG. 11

DATA REARRANGING CKT; 6a, 6b, 6c, 6d, 6e, 6f, 6g, 6h

FIG. 12 (legible block labels: PREPROCESSING SECTION; DATA REARRANGING CKT; CONTROL CKT; 6a, 6b, 6c, 6d, 6e, 6f, 6g, 6h; POSTPROCESSING SECTION)

FIG. 14

FIG. 15

FIG. 16

FIG. 17

